I'm seriously looking at porting the initscript package to POSIX. I don't think it would be too difficult. There are two ways of handling the rc.conf issue:
The right way. Break the current format. Anyone using the package would have to modify their rc.conf. It could be scripted for most cases, but not for people that do funny things and IIRC rc.conf advertises the fact that you can use any bash evaluation if you wish so. Note I do, however, have an idea so that scripts expecting rc.conf to be the old way would still work.
The backwards-compatible way: Sacrifice a bit of speed and have the POSIX shell parse the rc.conf file, not evaluate it. This would be better for migration and perhaps eventually the current rc.conf style could be dumped.
Thoughts?
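A minimal sketch of what the backwards-compatible way could look like, assuming each array sits on one line with no trailing comment. The helper name read_array is hypothetical and not part of initscripts; it pulls the elements out of the file with sed instead of sourcing it:

```sh
#!/bin/sh
# read_array NAME FILE (hypothetical helper, not part of initscripts)
# Prints the elements of a single-line bash array assignment NAME=(a b c)
# as one space-separated string. The file is parsed, never evaluated.
# Assumptions: one assignment per line, no trailing comment, no nested
# parentheses. If NAME is assigned more than once, the last one wins.
read_array() {
    sed -n "s/^[[:space:]]*$1=(\(.*\))[[:space:]]*\$/\1/p" "$2" | tail -n 1
}

# Possible usage:
#   DAEMONS=$(read_array DAEMONS /etc/rc.conf)
#   for d in $DAEMONS; do echo "$d"; done
```

Anything beyond a plain one-line list (command substitution, multi-line arrays, conditionals) would be silently missed, which is exactly the "funny things" case mentioned above.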
Offline
the right way. backwards parsing is not KISS, and will end up being a PITA.
Last edited by lloeki (2008-03-07 09:41:02)
To know recursion, you must first know recursion.
Offline
WARNING:
shift: 10: can't shift that many
this is what may happen if one uses pacman with /bin/sh as /bin/dash. end result: part of the system is b0rked.
That's when you install a package with a scriptlet, right?
Before, every scriptlet needed three lines at the end:
op=$1
shift
[ "$(type -t "$op")" = "function" ] && $op "$@"
As of pacman 3.1, these lines are no longer needed, which is great.
The only downside of this: backward compatibility was only kept with bash; it doesn't seem to work with other shells like dash or zsh.
So if you want to use dash as /bin/sh, you need to fix every scriptlet first.
If you consider all scriptlets in core,extra,community and aur, that might take a while.
Though, a simple sed command should be able to take care of all of them at once.
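Such a sed command might look roughly like this. It is only a sketch: it assumes the footer is exactly the three lines shown above, starting at a line that reads op=$1 and ending at the line containing type -t. The function name strip_legacy_footer is made up for illustration:

```sh
#!/bin/sh
# strip_legacy_footer FILE (hypothetical name)
# Prints FILE without the old three-line scriptlet footer.
# The sed range starts at the line "op=$1" and ends at the first following
# line that contains "type -t"; everything in between is deleted.
# Caveat: if "op=$1" appears but no "type -t" line follows, sed deletes
# through the end of the file, so review the output before replacing anything.
strip_legacy_footer() {
    sed '/^op=\$1$/,/type -t/d' "$1"
}

# Possible usage, writing to a copy first:
#   strip_legacy_footer foo.install > foo.install.new
```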
More info about that change in pacman here:
http://projects.archlinux.org/git/?p=pa … 439e42c41f
pacman roulette : pacman -S $(pacman -Slq | LANG=C sort -R | head -n $((RANDOM % 10)))
Offline
the right way. backwards parsing is not KISS, and will end up being a PITA.
Sorry if that's a stupid question, but what is the right way? ie which syntax would be used as a replacement?
Offline
The right way would be to change rc.conf so that instead of:
DAEMONS=(syslog-ng foo bar baz)
# Bash array that can be used as a true array (eg, ${DAEMONS[2]} )
we have
DAEMONS="syslog-ng foo bar baz"
# POSIX string that can be used as an array in a for i in $DAEMONS loop
At the end of rc.conf we could have a little compatibility script like this:
# Begin translation from POSIX /bin/sh to bashisms
if [ "$BASH" != "" ] && [ "$POSIXLY_CORRECT" != "y" ]; then
    # We are running under /bin/bash or the like
    # Emulate the old (bash specific) configuration file
    eval "MOD_BLACKLIST=($MOD_BLACKLIST)"
    eval "MODULES=($MODULES)"
    eval "INTERFACES=($INTERFACES)"
    eval "NETWORKS=($NETWORKS)"
    eval "ROUTES=($ROUTES)"
    eval "NET_PROFILES=($NET_PROFILES)"
    eval "DAEMONS=($DAEMONS)"
fi
That will not get run on POSIX shells (including bash --posix or /bin/sh linked to bash) but will get run on /bin/bash and will translate POSIX strings into Bash arrays.
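On the consuming side, the string form of DAEMONS can be walked with a plain for loop. A rough sketch follows; the handling of the ! (disabled) and @ (background) prefixes is my assumption about how rc.conf entries are marked, and the function name walk_daemons is made up:

```sh
#!/bin/sh
# walk_daemons LIST (hypothetical name)
# Walks a space-separated daemon list and prints what would be done
# with each entry. Prefix handling is an assumption for illustration:
#   !name  -> disabled, skipped
#   @name  -> started in the background
walk_daemons() {
    for d in $1; do
        case $d in
            '!'*) ;;                           # disabled entry, do nothing
            @*)   echo "background ${d#@}" ;;  # strip the leading @
            *)    echo "start $d" ;;
        esac
    done
}

walk_daemons "syslog-ng !foo @bar baz"
```

Only case patterns and the ${d#@} prefix strip are used, so this should behave the same under dash and bash.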
Offline
zsh test.sh 1.29s user 0.32s system 95% cpu 1.671 total
bash test.sh 2.06s user 0.26s system 96% cpu 2.403 total
dash test.sh 0.42s user 0.29s system 96% cpu 0.743 total
fish test.fish 18.33s user 35.60s system 91% cpu 58.819 total
What's up with fish? That's ungodly slow.
Offline
surprising, but not completely. fish's focus is the opposite of dash's: fish is an easy-to-use interactive shell (so scripting performance may not be on par) at the expense of posix compatibility, whereas dash is a performant and unbreakable non-interactive shell with posix conformance in mind.
I suppose fish may also be initializing some interactive stuff on each invocation (even in a script), like history (it's then sorting a b-tree) and/or autocompletion mechanisms.
maybe this abysmal performance can be explained by the author of fish... and may be worthy of a bug report upstream.
PS: really, I don't know, since I only discovered fish two hours before the above post was written...
Last edited by lloeki (2008-03-09 16:54:56)
Offline
I have little to add except for my moral support for this enterprise.
Offline
I'm seriously looking at porting the initscript package to POSIX. I don't think it would be too difficult. There are two ways of handling the rc.conf issue:
The right way. Break the current format. Anyone using the package would have to modify their rc.conf. It could be scripted for most cases, but not for people that do funny things and IIRC rc.conf advertises the fact that you can use any bash evaluation if you wish so. Note I do, however, have an idea so that scripts expecting rc.conf to be the old way would still work.
The backwards-compatible way: Sacrifice a bit of speed and have the POSIX shell parse the rc.conf file, not evaluate it. This would be better for migration and perhaps eventually the current rc.conf style could be dumped.
Thoughts?
What's the point of all this. What's broken?
What in posix suggests that we must have a sh compatible initscript?
What do we gain for posix conformity apart from a silly fuzzy wuzzy feeling, breaking everyone's system, and more complicated initscripts?
Any speed up you'll see will probably be negligible. Nothing in rc.sysinit typically does loops with 50000 iterations so that benchmark is bogus. Profile the bootup. Except for udev, bash typically spends no significant amount of time doing much.
Fish: It's probably not really intended as a scripting language. If you filed a bug, the author would probably tell you to use bash instead.
Last edited by iphitus (2008-03-10 00:30:24)
Offline
Actually, the initscripts will be less complicated when bashisms are removed.
The closer to posix the better. Those scripts can be used on other systems.
It's too expensive for Arch to apply for the UNIX label. Not having it, but conforming to it is a good thing.
Offline
What's the point of all this.
what's the fuss? please cool down. someone is trying to do something another way. nobody said it would replace current initscripts on the double.
What in posix suggests that we must have a sh compatible initscript?
nothing. that doesn't prevent us from trying.
What do we gain for posix conformity apart from a silly fuzzy wuzzy feeling, breaking everyone's system, and more complicated initscripts?
we're opt-in for breaking. what's more, to me bashisms are not KISS, while a simple posix compliant shell like dash is.
Any speed up you'll see will probably be negligable
we'll see, that's the whole point of the thing: moving away from lying benchmarks. besides, ubuntu has seen some improvement, so we might see one as well. who knows. what's more, performance bottlenecks in bash may not be only located in loops. only real testing will tell if it's worthy.
Nothing in rc.sysinit typically does loops with 50000 iterations
it does loop 63 times to set console font on vc1 to vc63. this is the second bottleneck besides udev for me.
EDIT: removed some stupid comment.
Last edited by lloeki (2008-03-10 08:43:26)
Offline
what's the fuss? please cool down. someone is trying to do something another way. nobody said it would replace current initscripts on the double.
I'm cool, I'm just working out what the point of this is. If you think it's a good idea, you shouldn't mind a few questions.
What in posix suggests that we must have a sh compatible initscript?
nothing. that doesn't prevent us from trying.
So.. doing this wouldn't give us posix initscripts, as there's no such thing. It'd just give us scripts that happen to be written in posix sh. Which in no way is better than any other language, such as bash, C or python.
we're opt-in for breaking. what's more, to me bashisms are not KISS, while a simple posix compliant shell like dash is.
So the bashisms will still be there, just if'ed out. That's not simple and ironically it wouldn't be posix sh compliant. dash will not accept that, it'll claim a syntax error. At best, all these scripts would ever be is a separate initscripts package for those who'd like to feel fuzzy wuzzy. posix scripts aren't some big deal to aim for here, and won't make our scripts more portable in any relevant way. The kernel, and many of the userspace GNU tools, are not completely posix compliant.
we'll see, that's the whole point of the thing: moving away from lying benchmarks. besides, ubuntu has seen some improvement, so we might see one as well. who knows. what's more, performance bottlenecks in bash may not be only located in loops. only real testing will tell if it's worthy.
ubuntu have far more complicated scripts, where far more is done by bash itself. Barring udev as mentioned, little of the time our scripts take is actually consumed by bash.
it does loop 63 times to set console font on vc1 to vc63. this is the second bottleneck besides udev for me.
Given the bench above, (64*(2.17-0.714))/50000 = 0.0019 seconds. A noticeable increase in speed. Significant increases could instead be found by making udev's blacklisting more efficient for example.
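For anyone who wants to redo that estimate with their own timings: POSIX sh arithmetic is integer-only, so the sketch below hands the division to awk. The figures are the ones used in the line above, taken as-is; the function name loop_saving is made up:

```sh
#!/bin/sh
# loop_saving ITERATIONS SLOW FAST BENCH_ITERATIONS (hypothetical name)
# Estimates the time saved over ITERATIONS loop passes, given the total
# time of a slow and a fast shell over BENCH_ITERATIONS passes.
loop_saving() {
    awk -v n="$1" -v slow="$2" -v fast="$3" -v total="$4" \
        'BEGIN { printf "%.4f\n", n * (slow - fast) / total }'
}

loop_saving 64 2.17 0.714 50000   # prints 0.0019
```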
Anyway, good luck with the effort. I'm sure you'll find a solid practical reason for it. I'm happy to be wrong.
Offline
I have little to add except for my moral support for this enterprise.
Ditto.
Fish: It's probably not really intended as a scripting language.
It's not. It's over a year since I last used it, but if I recall correctly, that's why the author felt free to abandon both bash and POSIX compatibility in the name of "friendliness".
0 Ok, 0:1
Offline
I'm cool I'm just working out what the point of this is. If you think it's a good idea, you shouldnt mind a few questions.
of course not
So.. doing this wouldn't give us posix initscripts, as there's no such thing.
agreed.
It'd just give us scripts that happen to be written in posix sh.
absolutely. it just happens to be a requirement for dash to be used. the ultimate goal here is not to achieve posix compatibility, but making them run with dash (which is supposed to be faster).
Which in no way is better than any other language, such as bash, C or python.
speaking at the language level, indeed (although we could as always argue to no ends about which language is the best). at the interpreter level that's another story, and the goal here is to evaluate the supposed dash performance advantages over bash.
also, there is that minimalistic approach in dash that I happen to like, supposedly meaning it could be more reliable (like statically built, no dependencies, no useless features WRT (non-)interactiveness). also it may be less memory hungry, so it may help on running arch on smaller systems.
So the bashisms will still be there, just if'ed out.
uh, no they won't be there anymore. this is supposed to be a reimplementation of the current initscripts into strict posix, indeed supposed to be in a separate initscripts package.
won't make our scripts more portable in any relevant way.
well, portability is not really a concern here. after all, where else than Arch would Arch initscripts run? it's not like one would want to run them on a BSD. the issue here is that dash has mandatory posix compatibility. posix is not the end, it's a requirement.
The kernel, and many of the userspace GNU tools, are not completely posix compliant.
I just hope they use #!/bin/bash where required so as not to break stuff if I were to use dash as /bin/sh...
ubuntu have far more complicated scripts, where far more is done by bash itself. Barring udev as mentioned, little of the time our scripts take is actually consumed by bash.
I hope either way that something can be done about it. it lasts for ~10s on my core 2 duo laptop.
0.0019 seconds. A noticable increase in speed.
indeed! I really think the slowness of that step is not at all bash related, but because setting a font on a terminal is costly. before then I was thinking of a configuration variable in rc.conf so that the font need not be set on all 63 VCs, but only on the first 6 or 12.
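A sketch of how such a variable could drive the loop in plain posix sh. CONSOLE_VCS is a hypothetical name, not an existing rc.conf option, and the actual font-setting call is deliberately left out:

```sh
#!/bin/sh
# CONSOLE_VCS is a hypothetical rc.conf variable (not an existing option).
# It defaults to 63 so that the current behaviour is kept when unset.
CONSOLE_VCS=${CONSOLE_VCS:-63}

# list_vcs COUNT: prints vc1 .. vcCOUNT, one per line, without relying on
# seq or bash brace expansion.
list_vcs() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "vc$i"
        i=$((i + 1))
    done
}

for vc in $(list_vcs "$CONSOLE_VCS"); do
    # the real script would set the console font on "$vc" here
    :
done
```

With CONSOLE_VCS=6 in rc.conf the loop body would run 6 times instead of 63.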
Anyway, good luck with the effort. I'm sure you'll find a solid practical reason for it. I'm happy to be wrong.
even if it proves to be useless, we'll have gained at least a better understanding of how the current initscripts work, and a training on how to achieve some work with a given constraint.
Last edited by lloeki (2008-03-10 11:32:07)
Offline
I wonder how bash arrays could be scripted with a pure posix-compliant shell.
Could someone give an example?
to live is to die
Offline
no arrays, so one has to work around that.
> cat arrays.bash
#!/bin/bash
array=('foo' 'bar' 'baz')
for i in ${array[@]}; do
    echo $i
done
> cat arrays.dash
#!/bin/dash
array="foo bar baz"
for i in $array; do
    echo $i
done
posix de facto restrictions:
- cannot find item n without walking through the n-1 previous ones
- items can't have (white)space in them
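The whitespace restriction applies to the string-as-list trick. The one real list posix sh does offer is the positional parameters, and quoted "$@" keeps items with spaces intact. A minimal sketch (the function name print_items is made up):

```sh
#!/bin/sh
# print_items ITEM...: prints each argument in angle brackets.
# "$@" expands to one word per argument, so embedded spaces survive.
print_items() {
    for item in "$@"; do
        printf '<%s>\n' "$item"
    done
}

# set -- replaces the positional parameters of the current shell or function
set -- "foo bar" "baz qux" plain
print_items "$@"
```

The catch is that there is only one such list per shell or function scope, so it does not help much for a config file holding several arrays at once.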
Offline
how to get item n with bash:
echo ${array[$n]}
with dash:
get_item() {
    # $1 is the space-separated list, $2 the zero-based index to print
    j=0
    for i in $1; do
        [ "$j" -eq "$2" ] && echo "$i"
        j=$((j + 1))
    done
}
example:
n=1
get_item "$array" $n
will return 'bar'
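An alternative sketch that avoids the explicit counter: load the list into the positional parameters and shift to the wanted index. The name nth_item is made up, and the same restrictions apply (items are split on whitespace, and glob characters in items would be expanded):

```sh
#!/bin/sh
# nth_item LIST INDEX (hypothetical name)
# Prints the item at zero-based INDEX of the space-separated LIST.
# Returns 1 without printing if INDEX is past the end of the list.
nth_item() {
    list=$1
    n=$2
    set -- $list                    # unquoted on purpose: split into words
    [ "$n" -lt "$#" ] || return 1   # index out of range
    shift "$n"                      # drop the first n items
    printf '%s\n' "$1"
}

array="foo bar baz"
nth_item "$array" 1   # prints bar
```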
Last edited by lloeki (2008-03-10 14:12:35)
Offline
funny thing.
running (from fish, which has no time command, thus the starting bash)
bash -c 'time ./arrays.bash' and bash -c 'time ./arrays.dash' (in any order) consistently shows dash ahead of bash by a factor of twenty.
so I changed to array=($(seq -s ' ' 1 1000)) in bash and array=$(seq -s ' ' 1 1000) in dash, and added a lookup of item 500 in each with their respective version. it turns out that even crawling through the 1000 items, then looking for item 500, dash outperforms bash by a factor of 2. and if we comment out the 1000-item loop, leaving only the lookup, dash and bash are on par. the funniest part is that if I increase n to 1000, the result doesn't change: they are still on par.
wow. sincerely, I would have expected dash to be much slower there, given we're 'manually' crawling way down hundreds of items, and doing arith calls and var assignments, when bash supposedly should just look up a pointer...
Last edited by lloeki (2008-03-10 14:27:00)
Offline
no arrays, so one has to work around that.
> cat arrays.bash
#!/bin/bash
array=('foo' 'bar' 'baz')
for i in ${array[@]}; do
    echo $i
done
> cat arrays.dash
#!/bin/dash
array="foo bar baz"
for i in $array; do
    echo $i
done
posix de facto restrictions:
- cannot find item n without walking through the n-1 previous ones
- items can't have (white)space in them
this means rc.conf and other files will have to change their format to be posix-compliant
to live is to die
Offline
yup. this has been mentioned up here with a potential workaround to keep compatibility with current initscripts.
Last edited by lloeki (2008-03-10 15:33:17)
Offline
So the bashisms will still be there, just if'ed out. That's not simple and ironically it wouldnt be posix sh compliant. dash will not accept that, it'll claim syntax error. At best, all these scripts would ever be is a separate initscripts package for those who'd like to feel fuzzy wuzzy.
I agree the bashisms being if'ed out is not KISS. However what I posted above (using eval) lets it work in dash or bash just fine.
As for speed, it's not just loops. It seems that there is a fair amount of latency with loading bash compared to dash, which mainly matters for the udev module script as the rest of the scripts aren't called too frequently.
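A rough way to measure that startup latency in isolation is to launch each shell many times with a no-op command. This is only a sketch; spawn_many is a made-up helper, and the shell paths in the usage lines are examples:

```sh
#!/bin/sh
# spawn_many SHELL COUNT (hypothetical name)
# Starts SHELL COUNT times, each time running only the no-op ':'.
# Prints the number of successful launches, so nearly all the time
# spent is process startup and shell initialisation.
spawn_many() {
    shell=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$shell" -c ':' || return 1
        i=$((i + 1))
    done
    echo "$i"
}

# Possible usage, saved as startup.sh with a final line: spawn_many "$@"
#   time sh startup.sh /bin/bash 1000
#   time sh startup.sh /bin/dash 1000
```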
Last edited by gorn (2008-03-10 17:48:23)
Offline
I support anything that makes udev fast.
Offline
the fastest distro needs the fastest sh
isn't it possible to 'patch' the current files one by one with #!/bin/dash ?
so problems would appear one by one too, and not kill the whole boot process.
or is it possible to have a separate package for all this stuff, to be tried with no fear ?
the boot is already a problem for splash/animated/beautiful ones, now for dash use; maybe a flexible way could exist..
i can't really help, just post a few ideas and agree that dash for boot is a good idea
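The one-file-at-a-time idea could be tried with something like the sketch below. It assumes the script's first line is exactly #!/bin/bash, and the function name to_dash is made up. It writes the converted script to stdout, so the original is never touched:

```sh
#!/bin/sh
# to_dash FILE (hypothetical name)
# Prints FILE with a first-line "#!/bin/bash" replaced by "#!/bin/dash".
# Only line 1 is considered; any other first line is left unchanged.
to_dash() {
    sed '1s|^#!/bin/bash|#!/bin/dash|' "$1"
}

# Possible usage on a copy, followed by a syntax-only check:
#   to_dash rc.local > rc.local.dash
#   dash -n rc.local.dash
```

Note that the -n check only reports syntax errors such as array assignments. Bashisms that parse as ordinary commands would still fail only at run time.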
Offline
OK, but how much speed can be gained by "dashing" only the udev scripts? Are there any problems expected with that?
Afterall, udev is probably the most time consuming, bash-based part of the booting process.
Last edited by cromo (2008-03-11 00:16:42)
Offline
I made a patch against current git doing this and sent it to Thomas.
I should have read this thread more carefully before sending it though. I used a compat wrapper script that got sourced:
[ -n "$BASH" ] && . /etc/rc.d/functions.compat
Would be better with the if and evals.
I run this on my laptop and I have my /bin/sh pointing to dash already.
Offline