
#1 2010-02-25 03:33:27

Xyne
Administrator/PM
Registered: 2008-08-03
Posts: 6,965
Website

Haskell: fastest way to read and print a file? & hGetBuf example?

I was playing around with simple IO in Haskell just for the sake of learning and decided to implement a very basic version of "cat" for a rough speed comparison. Being a Haskell noob, I tried various combinations of readFile and hGetContents from System.IO before I realized that they read a single Char at a time, which explained why they were slower than even Python and Perl. I found the hGetBuf function but I couldn't figure out how to get my data back from the Ptr. Eventually I ended up with this:

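import System.Environment (getArgs)
import qualified Data.ByteString as B

-- Read the file named by the first argument as raw bytes and write it to stdout.
main :: IO ()
main = getArgs >>= (B.readFile . head) >>= B.putStr

time cat /var/log/pacman.log | wc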
297783 1945570 32536814

real    0m0.624s
user    0m0.620s
sys     0m0.017s



time ./test /var/log/pacman.log | wc
297783 1945570 32536814

real    0m0.671s
user    0m0.640s
sys     0m0.040s

where "test" is the program above compiled with "ghc --make test.hs -o test -O".


So, my questions:
1) Is there any way to shave off the last few milliseconds to make it as fast as cat for that operation?
2) Can someone give a clear example of how to use hGetBuf, specifically how to "marshal" a Ptr into a String or ByteString?

I've tried to find examples but the very few that I've found haven't been that helpful.


Slightly off-topic yet tangential questions:
*) Does anyone else find it irksome that you have to dig through Haskell's abstraction to deal with the underlying system? I'm only just beginning to learn it but I've already come across a few things which feel like square blocks hammered into round holes.


My Arch Linux Stuff | Forum Etiquette | Community Ethos - Arch is not for everyone


#2 2010-02-27 04:18:57

skottish
Forum Fellow
From: Here
Registered: 2006-06-16
Posts: 7,942

Re: Haskell: fastest way to read and print a file? & hGetBuf example?


#3 2010-02-27 16:57:35

Xyne
Administrator/PM
Registered: 2008-08-03
Posts: 6,965
Website

Re: Haskell: fastest way to read and print a file? & hGetBuf example?

Thanks skottish, but it doesn't help me. The speed issue in that case is due to character encoding, which shouldn't apply to a ByteString which, as I've understood it, is treated as raw binary data. The pointer example in that post never pulls the data out of the pointer either but instead passes the pointer on to an hPutBuf call which does it internally.
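
For anyone following along, the pattern I mean looks roughly like the sketch below. This is my own reconstruction and not the code from that post: copyHandle and the 64 KiB buffer size are names and values I made up for illustration. The buffer is filled by hGetBuf and handed straight to hPutBuf, so the bytes are never turned into a Haskell value.

```haskell
import Foreign.Marshal.Alloc (allocaBytes)
import System.Environment (getArgs)
import System.IO (Handle, IOMode (..), hGetBuf, hPutBuf, stdout, withBinaryFile)

-- Hypothetical helper: copy everything from src to dst through one raw buffer.
copyHandle :: Handle -> Handle -> IO ()
copyHandle src dst = allocaBytes bufSize loop
  where
    -- Assumed buffer size; any positive value works.
    bufSize :: Int
    bufSize = 65536
    loop buf = do
      n <- hGetBuf src buf bufSize        -- number of bytes actually read
      if n > 0
        then hPutBuf dst buf n >> loop buf -- write exactly those bytes, then repeat
        else return ()                     -- zero bytes read means end of file

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> withBinaryFile path ReadMode (\h -> copyHandle h stdout)
    _      -> putStrLn "usage: copybuf FILE"
```

That gives a working "cat", but it still doesn't show how to get at the bytes behind the Ptr from Haskell code.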



#4 2010-02-27 18:48:04

skottish
Forum Fellow
From: Here
Registered: 2006-06-16
Posts: 7,942

Re: Haskell: fastest way to read and print a file? & hGetBuf example?

Xyne wrote:

Thanks skottish, but it doesn't help me. The speed issue in that case is due to character encoding, which shouldn't apply to a ByteString which, as I've understood it, is treated as raw binary data. The pointer example in that post never pulls the data out of the pointer either but instead passes the pointer on to an hPutBuf call which does it internally.

Well crapola. I don't know enough about Haskell to know what I was reading anyway. That's an issue that I will start to resolve in the next week or two.

One thing that would be very helpful is if we can start to draw some of the Arch-Haskell users/developers into the forums to help with posts like this. We have some expert coders that are helpful in other places that are so close to here (so to speak). This forum would be a great place to get Haskell hackers together.


#5 2010-02-27 19:52:19

Xyne
Administrator/PM
Registered: 2008-08-03
Posts: 6,965
Website

Re: Haskell: fastest way to read and print a file? & hGetBuf example?

*nods*

Still, I should probably head over to the Haskell mailing list for such questions. I was hoping that this was relatively trivial and could be quickly answered here but it seems that Haskell is somewhat of a black art that many dabble in yet few master.



#6 2010-05-06 17:06:50

CBM80
Member
Registered: 2010-05-06
Posts: 6

Re: Haskell: fastest way to read and print a file? & hGetBuf example?

At the moment this is only available in Google's archives; nevertheless, you might find it interesting.

Fast I/O in Haskell
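
On the hGetBuf question specifically, here is a minimal sketch of one way to marshal the buffer into a ByteString. It is not taken from the linked page; readChunk and the 64 KiB chunk size are hypothetical names and values chosen for the example. It relies on packCStringLen from Data.ByteString, which copies a given number of bytes out of a pointer.

```haskell
import qualified Data.ByteString as B
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (castPtr)
import System.Environment (getArgs)
import System.IO (Handle, IOMode (..), hGetBuf, withBinaryFile)

-- Hypothetical helper: read up to n bytes from a handle into a temporary
-- buffer, then copy the bytes that arrived into a fresh ByteString.
readChunk :: Handle -> Int -> IO B.ByteString
readChunk h n =
  allocaBytes n $ \buf -> do
    got <- hGetBuf h buf n               -- number of bytes actually read
    B.packCStringLen (castPtr buf, got)  -- copies `got` bytes out of the Ptr

-- Print a file chunk by chunk, stopping when a read returns no bytes.
main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> withBinaryFile path ReadMode loop
    _      -> putStrLn "usage: readchunk FILE"
  where
    loop h = do
      chunk <- readChunk h 65536         -- assumed chunk size
      if B.null chunk
        then return ()
        else B.putStr chunk >> loop h
```

The copy in packCStringLen is what makes this safe, since the buffer from allocaBytes is freed as soon as the callback returns. In practice B.hGet from Data.ByteString reads a chunk from a handle directly, so the manual version is mostly useful for seeing how the pieces fit together.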


#7 2010-05-06 17:25:48

brisbin33
Member
From: boston, ma
Registered: 2008-07-24
Posts: 1,799
Website

Re: Haskell: fastest way to read and print a file? & hGetBuf example?

Xyne wrote:

Slightly off-topic yet tangential questions:
*) Does anyone else find it irksome that you have to dig through Haskell's abstraction to deal with the underlying system? I'm only just beginning to learn it but I've already come across a few things which feel like square blocks hammered into round holes.

i know what you mean (or i think i do...). 

as i understand it, since haskell is a pure language, anything dealing with IO needs to get through those monadic abstraction layers which every tutorial will tell you is a bizarre concept not for the faint of heart. when i started trying to learn haskell (a process i need to pick up again), i decided to leave the monads for later and try to find a pure outlet for some haskell code. working through project euler, i found that haskell was extremely literal, simple, and easy since the problems are purely mathematical with no IO (other than a final show $ someInt) required. might be something worth looking into.

for reference, here's my euler page.


#8 2010-05-06 18:29:57

Xyne
Administrator/PM
Registered: 2008-08-03
Posts: 6,965
Website

Re: Haskell: fastest way to read and print a file? & hGetBuf example?

@CBM80
Thanks for the link. I've bookmarked it for now.


@brisbin33
I've been coding a bit in Haskell lately and it feels like I'm getting comfortable with it, including monads (e.g. state transformers, system IO). I think I'm over the first big conceptual stumbling block and it's starting to feel much more intuitive. Nevertheless I still generally agree with that statement, although now I would have worded it differently.

It feels like they've pulled a sheet over the underlying system. You can still see the basic shape of it and where different bits protrude, but you're supposed to pretend that it's all nice and smooth. If you want to do anything serious with it, you have to cut a hole through the sheet (i.e. use the foreign function interface and another language which can get to it) to gain full access.

When I wrote that I think I actually had Ints and Integers in mind. The difference between them is only how the underlying system represents them. From a purely abstract|functional perspective, there should be no distinction between them. I understand that the language is ultimately bound by the hardware and I actually appreciate that the distinction can enable the programmer to optimize his code. It just annoys me that there is a layer of abstraction that prevents me from exerting more control over the types. For example, it would be nice to be able to declare my own "PositiveInt" type and have it represented by an unsigned int in the underlying system. I can "see" from within Haskell how "Int" is represented, but I can't create a similar representation (without resorting to the FFI, afaik). There has been talk about this before and how to make all types user-definable. Some tutorials even give the dummy declaration of Int as "data Int = ...|-3|-2|-1|0|1|2|...". Obviously you would need some sort of meta-language to handle it though (which seems to be in the works).
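
That said, a newtype around Word from Data.Word may get part of the way there without the FFI, since a newtype shares the runtime representation of the type it wraps. A rough sketch under that assumption follows; mkPositiveInt and fromPositiveInt are hypothetical helper names, and the positivity check is enforced by the constructor function at run time, not by the type system.

```haskell
import Data.Word (Word)

-- Hypothetical type: a strictly positive integer stored as an unsigned machine word.
newtype PositiveInt = PositiveInt Word
  deriving (Eq, Ord, Show)

-- Smart constructor: accepts only values in the range 1 .. maxBound of Word.
mkPositiveInt :: Integer -> Maybe PositiveInt
mkPositiveInt n
  | n > 0 && n <= toInteger (maxBound :: Word) = Just (PositiveInt (fromInteger n))
  | otherwise = Nothing

-- Unwrap back to an Integer.
fromPositiveInt :: PositiveInt -> Integer
fromPositiveInt (PositiveInt w) = toInteger w

main :: IO ()
main = do
  print (mkPositiveInt 42)    -- accepted
  print (mkPositiveInt (-3))  -- rejected
```

It is not the same as declaring the representation myself, but it does give a distinct type backed by an unsigned word.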


For the record, I think the FFI is great and I see great potential in it and can't wait to start using it.


At some point I solved some of the problems on Project Euler (and got sidetracked with Perl and Python solutions). My quest to learn Haskell has also been periodic and so far I haven't gone back to PE but I intend to. Btw, I found it insightful to compare my solutions to the solutions posted in the wiki. They're full of little epiphanies.


*edit*
I realize that many apparent limitations are likely due to my own ignorance. I'm reading through the pages posted by CBM80 and already see some ways to get around the abstraction.

Last edited by Xyne (2010-05-06 19:03:17)


