
#1 2007-10-14 23:50:37

sinister99
Member
Registered: 2007-04-10
Posts: 136

Looking for a "desktop search" app

I am using KDE and am searching for a desktop search app that uses a database and searches within common filetypes. I've tried a couple so far and haven't found what I'm looking for:

kerry/beagle: WAAAY too much bloat; creates ~2 GB of logs per day, and it once crashed and started filling my filesystem with useless data.
strigi/clucene: tried it, didn't really work, settings kept on resetting to default.

I am looking for something that doesn't have heavy GNOME dependencies.

Any suggestions / what are you using?

Offline

#2 2007-10-15 00:26:20

hussam
Member
Registered: 2006-03-26
Posts: 572
Website

Re: Looking for a "desktop search" app

Tracker is just GTK+

There used to be some KDE search application called Kat. I've never tried it but it may be what you are looking for.

Offline

#3 2007-10-15 00:29:14

buttons
Member
From: NJ, USA
Registered: 2007-08-04
Posts: 620

Re: Looking for a "desktop search" app

Tracker is the only one worth trying if you're looking for a lack of bloat.


Cthulhu For President!

Offline

#4 2007-10-15 00:41:09

Blind
Member
From: Desert mountain
Registered: 2005-02-06
Posts: 386

Re: Looking for a "desktop search" app

Google just brought out their Google Desktop Utility for Linux. I haven't tried it (and probably won't), but I guess it is worth a shot...
Cheers,
Blind

Offline

#5 2007-10-15 00:45:52

somairotevoli
Member
Registered: 2006-05-23
Posts: 335

Re: Looking for a "desktop search" app

^Really? <runs to check>

Offline

#6 2007-10-15 02:04:27

sinister99
Member
Registered: 2007-04-10
Posts: 136

Re: Looking for a "desktop search" app

Blind wrote:

Google just brought out their Google Desktop Utility for Linux. I haven't tried it (and probably won't), but I guess it is worth a shot...
Cheers,
Blind

I noticed that, but they only offer an rpm or deb archive, and I don't feel like manually installing it and verifying all the permissions.

I considered making a PKGBUILD, but the license was sketchy.

Trying out tracker right now... I'll see how that works.

Offline

#7 2007-10-15 03:11:56

iBertus
Member
From: Greenville, NC
Registered: 2004-11-04
Posts: 2,228

Re: Looking for a "desktop search" app

Please report back on Tracker. I've been thinking of switching from Compiz/GNOME back to FVWM and I'm looking for replacements for all my GNOME apps. Also, if anyone knows of a non-GNOME/non-KDE alternative for F-Spot please let me know (I prefer open source solutions if possible).

Offline

#8 2007-10-25 14:17:16

awagner
Member
From: Mainz, Germany
Registered: 2007-08-24
Posts: 191

Re: Looking for a "desktop search" app

Hi all,
I hope this is not hijacking the thread, but I would like to hear some more pros and cons. I know I should just check them out myself, and I do have them installed now, but TBH I don't have the time right now, and then again, maybe this should be documented somewhere. (Maybe we can put what we gather here on the wiki...?)

Currently I have
[*] Tracker
[*] beagle
[*] strigi
[*] Recoll
[*] Pinot
[*] Searchmonkey
[*] Google desktop
on my list.

I'd be interested in:
[*] which of these don't need a daemon running all the time? (Searchmonkey; Recoll: optionally)
[*] Of those that do, which are inotify-based? (Pinot, Tracker, Strigi: under development, Recoll)
[*] which of them needs (how much) hdd space for its index file?
[*] Which of them needs how much CPU for its indexing?
[*] RAM needed?
[*] HDD space for the install (excluding dependencies)?
[*] which backend(s)? (Recoll: Xapian; Tracker: Sqlite; Strigi: clucene and hyperestraier, sqlite3 and xapian are in the works; Pinot: Xapian)
[*] Daemon-Searchproggy communication via DBus? (Strigi, Tracker, Pinot)
[*] Xesam compliance? (Strigi, Tracker, Pinot, Recoll are mentioned as collaboration partners on the xesam.org site, but only strigi is listed in the implementations section. What gives?)
[*] RDF support?
[*] What dependencies are needed for each? Which Arch repo are they in?
[*] What (common) filetypes are not supported? Plans about it?
[*] How quickly are we getting search results? and finally,
[*] How flexible are the search options? Regexes, phonetic similarity, proximity search, (multilingual) word stemmers etc...
[*] Am I getting my search results along with their respective context?
[*] What frontends/integration is there? Catfish, affinity, kio_slave etc.
More?
Oh yes, of course: any subjective experiences (e.g. install problems, search misses, plain happiness etc.)?

What do you think? (And yes, when I have the time, I am going to invest some of it into adding here.)

Andreas

PS. It dawned on me that I have consistently ignored beagle in all the above. That was not my intention. But I haven't done the research there yet.

PPS. My own needs: I have a few directories with /lots/ of text documents (journal articles, scanned books) in doc, pdf and html format whose filenames don't reflect my likely search patterns. That's why I have to search through their content from time to time. OTOH, this is not very frequent, and I am somewhat reluctant either to have a daemon running all the time or to spend so many MB just on the index. Searchmonkey will be the first I am going to check, to see if it supports all the file formats I need, allows me to search for what I want and is not too slow to return anything. (If it is, I suppose that means I have to have an index after all.)

Offline

#9 2007-10-27 00:58:34

Filosofem
Member
Registered: 2006-03-01
Posts: 28

Re: Looking for a "desktop search" app

Isn't locate/updatedb already there for that? ;)

Offline

#10 2007-10-27 02:32:50

Phrodo_00
Member
From: Seattle, WA
Registered: 2006-04-09
Posts: 342
Website

Re: Looking for a "desktop search" app

Filosofem wrote:

Isn't locate/updatedb already there for that? ;)

With a desktop search engine you also look inside the file and stuff like the metadata, not just plain old filenames.
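
To make the difference concrete, here is a minimal Python sketch of content search: it walks a directory and builds an inverted index from the words inside each file, so a query matches on contents rather than filenames. It is not how any of the tools in this thread work internally (they sit on Xapian, CLucene, SQLite and so on); `build_index` and `search` are made-up names, and it only handles plain text.

```python
import os
import re
from collections import defaultdict


def build_index(root):
    """Map each lowercase word to the set of files under root that contain it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read()
            except OSError:
                continue  # unreadable file: skip it
            for word in re.findall(r"\w+", text.lower()):
                index[word].add(path)
    return index


def search(index, word):
    """Return the sorted list of files whose content contains the word."""
    return sorted(index.get(word.lower(), ()))
```

locate would find a file called notes.txt by the word "notes"; this index only finds it by the words written inside it.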

Offline

#11 2007-10-27 13:28:20

awagner
Member
From: Mainz, Germany
Registered: 2007-08-24
Posts: 191

Re: Looking for a "desktop search" app

Found two more: swish-e (in AUR, but orphaned and outdated; needs a couple of perl modules and external helpers; cmdline and cron usage; targeted more at webservers) and doodle (plaintext contents and meta information only; its daemon uses fam).

Offline

#12 2007-10-29 00:38:53

awagner
Member
From: Mainz, Germany
Registered: 2007-08-24
Posts: 191

Re: Looking for a "desktop search" app

Here are my first findings:

PINOT

All its dependencies are in official repos except for libtextcat and xapian-core (both in community, however).
Src download is 880Kb, and there's a package in [community]:
Targets: pinot-0.76-1 
Total Package Size:   1.36 MB
Total Installed Size:   4.05 MB

It runs a daemon that uses Xapian as retrieval and storage engine and is addressable via dbus.
You can specify which folders should be indexed, and which of those should be monitored, too. So while the dbus-daemon would watch your files and respond to queries, you don't necessarily need it running all the time but can use a command-line tool 'pinot-search' to query the xapian db directly. (But you'd have to kill the daemon manually after it has done its indexing IIUC.)

Most of what is not plaintext has to be converted with external tools (pdftotext, antiword, unrtf etc.) and you can extend those associations easily based on mimetype. (Of those, unrtf is in community, dpkg in AUR, the rest is in official repos). This conversion takes some resources, tho.
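
The mimetype-to-converter idea can be sketched roughly like this in Python. This is not Pinot's code or its config format; the `CONVERTERS` table and both function names are assumptions for illustration. The command lines assume pdftotext, antiword and unrtf are installed and print plain text to stdout.

```python
import mimetypes
import subprocess

# Hypothetical association table: MIME type -> command that prints plain text
# to stdout. "{path}" is replaced by the file being indexed.
CONVERTERS = {
    "application/pdf": ["pdftotext", "{path}", "-"],
    "application/msword": ["antiword", "{path}"],
    "application/rtf": ["unrtf", "--text", "{path}"],
}


def converter_command(path):
    """Return the external command for this file, or None if it has no converter."""
    mime, _encoding = mimetypes.guess_type(path)
    template = CONVERTERS.get(mime)
    if template is None:
        return None
    return [part.replace("{path}", path) for part in template]


def extract_text(path):
    """Run the converter and return its output; other files are read as plain text."""
    command = converter_command(path)
    if command is None:
        with open(path, encoding="utf-8", errors="ignore") as handle:
            return handle.read()
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

Every converted file costs one extra process, which is presumably where the resource use during indexing comes from.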

Pinot is integrated with deskbar-applet and catfish (both via dbus).
On the other hand, it is itself a frontend to its own db and to other search engines (e.g. Firefox's Sherlock plugins), with both a (rather lightweight GTK) GUI and a CLI.
It understands queries that conform to Xesam specs.


Experiences:
System was sluggish when it did the initial indexing, but that was expected. I didn't have the opportunity to see how it does in monitoring mode...
Because... after having worked for quite some time I had to kill it because it filled my home partition. I had given it a directory with ~3100 files, mostly pdf, a few doc and html files, weighing in at 1.5Gb. At the time I killed the indexer, it had filled 1.6Gb...
Thus I didn't try the search features yet. (I know I could have, but at the time I realized this, I had deleted the index directory already.)


So these are still open questions:
[*] monitoring-mode performance impact?
[*] search expression options: RDF, RegEx, Proximity, Stemming?
[*] search results: how quickly, how accurately, with context?

What I find interesting is its own frontend functionality, with inclusion of other search engines (you can also add 'foreign' xapian indexes), and the Xesam adoption/conformity.


Is this information of some use to anyone? (Next will be Recoll...)

Offline

#13 2007-10-29 07:55:08

awagner
Member
From: Mainz, Germany
Registered: 2007-08-24
Posts: 191

Re: Looking for a "desktop search" app

Next one...

RECOLL

Depends only on Qt and xapian-core (from community).

All its dependencies are in official repos except for libtextcat and xapian-core (both in community, however).
Src download is 850Kb, and since the package in [community] is outdated, I built one of my own:
Targets: recoll-1.90-1 
Total Package Size:   0.9 MB
Total Installed Size:   2.6 MB



Backend:
It also uses the xapian backend.
It can run as a daemon that uses FAM or inotify to provide continuous monitoring/indexing, but the documentation says it's probably better to run a cronjob.
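
For anyone wondering what the cronjob approach amounts to, here is a rough Python sketch of a periodic pass that only picks up files changed since the previous run, by comparing modification times against a saved state file. It is not Recoll's implementation; `changed_files` and the JSON state file are assumptions for illustration, and the state file should live outside the indexed directory.

```python
import json
import os


def changed_files(root, state_path):
    """Return files under root that are new or modified since the last run."""
    try:
        with open(state_path, encoding="utf-8") as handle:
            seen = json.load(handle)
    except (OSError, ValueError):
        seen = {}  # first run or unreadable state: treat everything as new

    current, changed = {}, []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable
            current[path] = mtime
            if seen.get(path) != mtime:
                changed.append(path)

    # Remember what we saw so the next run can skip unchanged files.
    with open(state_path, "w", encoding="utf-8") as handle:
        json.dump(current, handle)
    return sorted(changed)
```

The trade-off against inotify monitoring is that nothing runs between passes, but new documents only become searchable after the next scheduled run.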

While Pinot was using an xml config file, Recoll has a very detailed set of plaintext config files and it can maintain several indexes in parallel (e.g. one with monitoring and one with cronjob indexing).

plaintext, html and email are handled internally, and for other file types you have to resort to external utilities again, of which unrtf is in community, pstotext is only in AUR and I didn't find id3lib/id3info at all.

Again, extension is possible. Here you have several steps, which take resources and are slightly more complex to set up than in Pinot but probably allow more fine-grained tweaking. You have a mimeconf configuration where you associate mimetype and filter ('application/pdf = exec rclpdf'), an assortment of ready-made filters in /usr/share/recoll/filters, and a mimemap config file where you associate filenames/endings with mimetypes. (Unfortunately mimemap uses a different format from e.g. mutt's mime.types, but it also includes special, Recoll-only settings, like excluding certain files from indexing or associating files with a certain name with a different mimetype when they're in another directory.)
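
To illustrate the shape of such an association, here is a small Python sketch that reads lines of the form quoted above ('application/pdf = exec rclpdf') into a lookup table. It only understands that one line shape; the real Recoll config files have more syntax than this, and `parse_filter_lines` is a hypothetical helper, not part of Recoll.

```python
def parse_filter_lines(text):
    """Parse 'mimetype = exec filter [args]' lines into {mimetype: [filter, args...]}."""
    filters = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line or "=" not in line:
            continue
        mimetype, action = (part.strip() for part in line.split("=", 1))
        words = action.split()
        # Only 'exec' entries name an external filter; anything else is ignored here.
        if words and words[0] == "exec":
            filters[mimetype] = words[1:]
    return filters
```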



Search:
Recoll has its own (Qt, non-KDE) GUI and is only addressable through it, so it is not integrated with any of the other frontends. (Except for Pinot, which should be able to access Recoll's xapiandb...)
You can use proximity and phonetic search, stem expansion and glob/regex expressions, and you can specify search criteria by metainfo (author, subject, keywords). AFAICT, Recoll does not (yet) understand queries that conform to Xesam specs, nor RDF.
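
In case "proximity search" is unclear: the idea is to match two words only when they occur within a few words of each other. A toy Python sketch of the principle follows; Recoll delegates this to Xapian, so this is not its implementation, and `positions` and `near` are made-up names.

```python
import re


def positions(text):
    """Map each lowercase word to the list of positions where it occurs."""
    table = {}
    for pos, word in enumerate(re.findall(r"\w+", text.lower())):
        table.setdefault(word, []).append(pos)
    return table


def near(text, first, second, max_distance):
    """True if the two words occur within max_distance words of each other."""
    table = positions(text)
    return any(
        abs(a - b) <= max_distance
        for a in table.get(first.lower(), [])
        for b in table.get(second.lower(), [])
    )
```

With the text "the daemon watches files and updates the index", `near(text, "daemon", "files", 2)` is true, while `near(text, "daemon", "index", 2)` is false.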

Recoll remembers the last few searches that you performed. You can use a combobox to recall them. However, only the search texts are remembered, not the mode (all/any/file name). (Also documents actually pre-/viewed are remembered.)

Typing Esc Space while entering a word in the simple search entry will open a window with possible completions for the word. The completions are extracted from the database.
Double-clicking on a word in the result list or a preview window will insert it into the simple search entry field.
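
The completion popup presumably boils down to a prefix lookup over the indexed terms. A minimal Python sketch of that idea, assuming the terms are available as a sorted list (the `completions` function is hypothetical, and how Recoll actually pulls terms out of the Xapian db is not shown here):

```python
from bisect import bisect_left


def completions(sorted_terms, prefix, limit=10):
    """Return up to `limit` terms from a sorted list that start with prefix."""
    # In a sorted list, all terms sharing a prefix sit next to each other,
    # so a binary search finds the first candidate.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix) or len(matches) == limit:
            break
        matches.append(term)
    return matches
```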

The formatting of search results can be customized freely with a qt-html expression. By default they display several occurrences of the search term along with their respective (very) limited context. They can be opened either with a mimeview configuration similar to the mimemap file from above or with xdg-open. Or you can view them in an external preview window (which has an incremental search feature). (You have icons for file types but no thumbnails.)



Experiences:
The system did not seem very sluggish to me while it did the indexing.
The same directory with ~3100 files, mostly pdf, a few doc and html files, weighing in at 1.5Gb, was indexed in
real    15m13.378s
user    8m47.466s
sys     3m33.439s

and additional stem databases for german and french took another 50 secs.
du -h:
14M     ./xapiandb/stem_english
15M     ./xapiandb/stem_de
13M     ./xapiandb/stem_fr
236M    ./xapiandb
246M    .

Recoll UI started up quickly and results were returned almost instantaneously. (Not a complicated search expression, tho) I have not yet tested for complicated expressions or a test search for a term that occurs only rarely, so I cannot tell if the small db size has its drawbacks.
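
For comparison purposes, the numbers above can be turned into ratios with a few lines of Python. This is just arithmetic on the figures quoted in this thread, reading "1.5Gb" as 1.5 GiB (an assumption); the Pinot figure is a lower bound, since that indexer was killed before it finished.

```python
def seconds(minutes, secs):
    """Convert a minutes/seconds pair from `time` output into seconds."""
    return minutes * 60 + secs


corpus_mb = 1.5 * 1024          # "1.5Gb" read as 1.5 GiB (assumption)
index_mb = 246                  # Recoll total from du -h, stem databases included
elapsed = seconds(15, 13.378)   # "real" time of the Recoll indexing run
files = 3100                    # approximate file count

index_ratio = index_mb / corpus_mb     # Recoll index size relative to the corpus
files_per_second = files / elapsed
mb_per_second = corpus_mb / elapsed
pinot_ratio = 1.6 / 1.5                # Pinot index vs corpus when it was killed

print(f"Recoll index/corpus: {index_ratio:.0%}")
print(f"Recoll throughput: {files_per_second:.1f} files/s, {mb_per_second:.1f} MB/s")
print(f"Pinot index/corpus: at least {pinot_ratio:.0%}")
```

So the Recoll index comes to roughly 16% of the corpus at about 3.4 files/s (about 1.7 MB/s), not counting the extra 50 secs for the stem databases, whereas the Pinot index had already passed the size of the corpus itself.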


So these are still open questions:
[*] monitoring-mode performance impact?
[*] search expression options: RDF ?
[*] search results: how accurately?
[*] attitude towards Xesam and offering services to other frontends?

What I find interesting is its configurability, and the small db size (well, as of now I have only pinot to compare it with, so maybe it's not so extraordinary after all).


(Next will be Tracker...)

Offline

#14 2009-02-17 21:26:30

Zibi1981
Member
From: Poland
Registered: 2008-01-31
Posts: 644

Re: Looking for a "desktop search" app

No report on Tracker? I'm particularly interested in this application...


"... being a Linux user is sort of like living in a house inhabited by a large family of carpenters and architects. Every morning when you wake up, the house is a little different. Maybe there is a new turret, or some walls have moved. Or perhaps someone has temporarily removed the floor under your bed."

MSI Raider GE78HX 13VI-032PL

Offline

#15 2009-02-17 23:40:57

SamC
Member
From: Calgary
Registered: 2008-05-13
Posts: 611
Website

Re: Looking for a "desktop search" app

Dead thread is dead. Please, people, read the date of the last post before replying.

Offline

#16 2009-02-18 17:33:40

anrxc
Member
From: Croatia
Registered: 2008-03-22
Posts: 834
Website

Re: Looking for a "desktop search" app

sinister99 wrote:

strigi/clucene: tried it, didn't really work, settings kept on resetting to default.

Lucky you, it saved you a lot more frustration. (BTW, to fix this you should edit the config file manually, so try it again if you want. One more thing: the default config is bad because you get the feeling that it won't recurse into dirs, but it does, so keep that in mind when selecting your dirs.) I tried it a few days ago and I have at least 25 bad things to say about it. To keep it short, let's mention only the worst: officially it supports PDF but not DOC, but in reality it indexed only my DOC files and not the PDFs. Another problem shows when it doesn't exit cleanly: it leaves a socket behind, and on the next start, if it finds that socket, it segfaults... Strigi is crap.

Edit: damn it. But it's sad, because it was bad then and it's bad today.
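
On the leftover-socket problem: a daemon can usually avoid it by probing the socket at startup and removing it when nobody answers. A rough Python sketch of that pattern follows. It is not Strigi's code, `bind_unix_socket` is a made-up name, and it assumes Linux/Unix domain sockets.

```python
import os
import socket


def bind_unix_socket(path):
    """Bind a Unix socket at path, removing a leftover file from an unclean exit."""
    if os.path.exists(path):
        probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            probe.connect(path)
        except OSError:
            os.unlink(path)  # nobody is listening: the socket file is stale
        else:
            raise RuntimeError("another instance is already listening on " + path)
        finally:
            probe.close()
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    return server
```

The probe distinguishes a dead socket file from a running instance, so a second copy of the daemon refuses to start instead of stealing the path.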

Last edited by anrxc (2009-02-18 17:34:37)


You need to install an RTFM interface.

Offline

#17 2009-02-19 12:26:55

Mikko777
Member
From: Suomi, Finland
Registered: 2006-10-30
Posts: 837

Re: Looking for a "desktop search" app

anrxc wrote:

Strigi is crap.

You spread FUD :P
My bets are on strigi; it will prolly "pwn" in a few months.

http://en.wikipedia.org/wiki/Strigi

Last edited by Mikko777 (2009-02-19 12:27:26)

Offline

#18 2009-02-19 13:10:09

alecmg
Member
Registered: 2008-12-21
Posts: 86

Re: Looking for a "desktop search" app

google desktop search suits my needs and isn't too heavy
I don't keep it running either


Xyne wrote:
"We've got Pacman. Wacka wacka, bitches!"

Offline
