I've had an idea for looking up compounds in edict2 for a long time (almost two years), and I've finally written a GUI for it.
It's an addition to an IME I wrote two months ago called hayanyuu.
A small introduction to how hayanyuu works:
edict2 is searched by romaji to hiragana conversion. Results are selectable by function keys. Choices are then added to a "Line so far". This line can then be accepted with Enter and it will be copied to the X selection and typed with xdotool.
The idea behind compound search is that you usually only need a few radicals from each kanji to get a unique match for a compound.
For example, there are many radicals in these two kanji:
運 (movement, hat, car)
勢 (earth *2, legs, nine, a comma thing, power)
But if you just throw some common ones against the compound: [movement & hat] + [earth], it turns out there are only two edict entries that match! Even though there must be tons of kanji that have "earth" in them.
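A rough sketch of that idea in Python, not the actual hayanyuu code: intersect the kanji sets for the aliases of each position, then keep only the words whose characters fall in those sets. The alias table and word list are tiny made-up samples, and the names `RADICAL_KANJI`, `candidates` and `match_compounds` are hypothetical.

```python
# Toy alias -> kanji table. Hypothetical sample data, far smaller than a real radical list.
RADICAL_KANJI = {
    "movement": {"運", "道", "送"},
    "hat": {"運", "軍", "写"},
    "earth": {"勢", "地", "場"},
}

def candidates(aliases):
    """Kanji that contain every radical alias in the list."""
    return set.intersection(*(RADICAL_KANJI[a] for a in aliases))

def match_compounds(groups, words):
    """Words whose i-th character is a candidate for the i-th alias group."""
    cands = [candidates(g) for g in groups]
    return [
        w for w in words
        if len(w) == len(cands) and all(ch in c for ch, c in zip(w, cands))
    ]

# Made-up stand-in for the edict2 headwords.
WORDS = ["運勢", "運動", "道場", "地道"]

print(match_compounds([["movement", "hat"], ["earth"]], WORDS))
```

With this toy data only 運勢 survives, even though "earth" on its own still leaves three candidates for the second kanji.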
I made it work with aliases of radicals.
There are two special characters you can enter into the romaji bar: ? and `
? looks up all radicals and their aliases of all kanji:
The last ? makes it search.
` searches edict2:
`move hat - earth -`
The last ` makes it search.
You can also fill in hiragana with capitals, use the wildcards . and *, and add a repetition mark with &. It also takes kanji.
The - turns the last few lowercase aliases into a group of possible kanji.
It works by a grep search of edict2. A group of possible kanji is actually just [KANJI KANJI KANJI], and, if you're familiar with edict2, * is actually [^(); ]* so it can't run past the entry. The search as a whole is normally bound to the entry as well: if you search for just `勢` you won't get the "運勢" compound. You can do `.勢` though, but there's a ton of results.
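Here is a minimal Python sketch of how such a pattern could be put together. It is not the script's real code: the way the match is anchored to a headword (start of line or `;` before it, one of `(`, `;` or a space after it) is my assumption about the edict2 layout, `build_pattern` is a hypothetical helper, and the sample lines are simplified.

```python
import re

STAR = "[^(); ]*"  # what * expands to, as described above

def build_pattern(tokens):
    """tokens: '*', '.', or a string of candidate kanji for one position."""
    body = ""
    for tok in tokens:
        if tok == "*":
            body += STAR
        elif tok == ".":
            body += "."
        else:
            body += "[" + re.escape(tok) + "]"  # group of possible kanji
    # Assumed anchoring: a headword starts at the line start or after ';'
    # and ends right before '(', ';' or a space.
    return re.compile(r"(?:^|;)" + body + r"(?=[(; ])")

# Simplified sample lines in edict2 style.
LINES = [
    "運勢 [うんせい] /(n) fortune/luck/",
    "大勢 [おおぜい] /(n) crowd/",
    "勢い [いきおい] /(n) force/vigor/",
    "運動 [うんどう] /(n) exercise/",
]

def search(tokens):
    pattern = build_pattern(tokens)
    return [line for line in LINES if pattern.search(line)]

print(search(["運", "勢地場"]))  # one kanji group per position
print(search(["勢"]))            # bound to the entry, so no compound matches
print(search([".", "勢"]))       # any one character followed by 勢
```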
Configuration of the aliases is done by editing the script itself. I haven't made a good aliases list for myself yet.
It's really something you want to do yourself anyway. People see different things in radicals. Multiple aliases are possible, so share what you change and I'll add it to future versions.
More information is on my blog: http://archlinux.me/procyon/
Let me know if there's anything you'd like to see added.
Last edited by Procyon (2011-08-19 20:16:58)
Nice tool, and a good idea! Works quite fast for me, too.
One thing I would like to change: only show "Line so far" if the user is actually interested in viewing it, not during and after each and every addition.
It would be nice to have a) katakana, b) romaji output for "Line so far" as well.
Under which license do you publish this script? Maybe in the future I'd like to take some ideas and write something up which is suited 100% to my taste and requirements.
Thank you for your work
Thanks, I'll look into your suggestions.
So far I've only changed a few aliases.
I use the program a lot though, so I'll improve it where I can.
I'll probably be able to implement your suggestions without trouble.
I think I'll go for:
Rebinding Enter to look up entries in edict2 when there is romaji on the line and, if there isn't, to accept the "line so far".
Bind F3 to add romaji/verbatim.
And I'll also look into removing the limit on the number of results and adding bindings for Shift+F1 etc.
kakasi-cvs is used for hiragana to katakana conversion.
Feel free to write what you want from the ideas in this script.
I'd really like to see a better implementation too, because I use it a lot myself (mostly the "radical to compound search" when transcribing manga).
Version 2: http://sprunge.us/ijEH
This implements the above. (Enter to search, F3 -> romaji, extra results with Shift/Control + Function Keys)
And ` and ? no longer have to start with those characters. Just type ` or ? at the end instead of Enter.
Shift and Control work slightly differently in urxvt and xterm.
I got something consistent by not using F11 & F12, so when it says SF11/CF11 it means:
SF11 -> Shift + F1
SF20 -> Shift + F10
CF21 -> Control + F1
CF30 -> Control + F10
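To make the numbering explicit, here is a small Python sketch that turns those labels back into key combinations. It only follows the mapping listed above; `describe_key` is a hypothetical helper, not something in the script.

```python
def describe_key(label):
    """Translate a label like 'SF11' or 'CF30' into the key combination it stands for."""
    prefix, number = label[:2], int(label[2:])
    if prefix == "SF" and 11 <= number <= 20:
        return f"Shift + F{number - 10}"
    if prefix == "CF" and 21 <= number <= 30:
        return f"Control + F{number - 20}"
    raise ValueError(f"unknown key label: {label}")

for label in ("SF11", "SF20", "CF21", "CF30"):
    print(label, "->", describe_key(label))
```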
The English meaning is attached to the results now, and for radical to compound the reading as well. There is a terminal escape sequence to stop this from wrapping to the next line, but it doesn't work in multiplexers. I think I'll just add it in the next version.
If radical to compound is still vague, here is a screenshot: http://ompldr.org/vOXZlbQ
The part between ": KANJI :" will be put on the "Line so far"
Use "KANJI?" to see aliases quickly. Maybe I'll add a cheatsheet in the next version.
Last edited by Procyon (2011-08-19 23:47:10)
I added English search, some visual fixes and more aliases.
%: searches the English meanings in edict2 for exactly those words (AND search).
Control+E: edit the history file with vim. The top line will now display the history in reverse order, space delimited.
Control+V: paste clipboard (xsel -b)
ALT+0-9: Show radicals and their aliases that have that many strokes.
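For the %: English search, the AND behaviour could look roughly like this in Python. This is a sketch under assumptions, not the script's code: I'm reading "exactly" as whole-word, case-sensitive matching, `english_and_search` is a hypothetical name, and the sample lines are simplified.

```python
import re

def english_and_search(words, lines):
    """Keep the lines that contain every word as a whole word (AND search)."""
    patterns = [re.compile(r"\b" + re.escape(w) + r"\b") for w in words]
    return [line for line in lines if all(p.search(line) for p in patterns)]

# Simplified sample lines in edict2 style.
LINES = [
    "運勢 [うんせい] /(n) fortune/luck/",
    "幸運 [こううん] /(n) good luck/fortune/",
    "運動 [うんどう] /(n) exercise/",
]

print(english_and_search(["luck"], LINES))          # both luck entries
print(english_and_search(["good", "luck"], LINES))  # only the entry with both words
```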
Did it work okay?
If there's something you're missing, maybe I can add it.