
#1 2010-11-30 17:27:04

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

KanPeke - Kanji drill exercises in Bash

I've been practicing Japanese and wanted to share what I wrote. It's a simple drill program, and there are tons of programs like it, many of them better, but what I really wanted was to spend a long time on large practice sets, with dictionaries that make sense and a quiz where everything has to be typed out, and not much more.

I will write new dictionaries of around 300-400 characters each as I go through all the kanji in principal radical order. The first dictionary has all the 1- and 2-stroke radical kanji, 274 in total. The reason for this order is that the principal radical is often easy to spot, and as I add more dictionaries I want the sets to be easy to tell apart in memory.

Latest version
2011-02-12
wget -O kanpeke http://sprunge.us/FRMV; chmod +x kanpeke

Dictionary set 1
All:
wget -O kanji_by_radical_0001-0274.quiz http://sprunge.us/XGXP
Top 1000 used by newspapers (selection of the above dictionary)
wget -O 1000newspaper_0001-0274.quiz http://sprunge.us/UZZP
Those not in the top 1000:
wget -O not1000newspaper_0001-0274.quiz http://sprunge.us/KUHA

Dictionary set 2 (The compounds are still missing)
All:
wget -O kanji_by_radical_0275-0635.quiz http://sprunge.us/Cdde
Top 1000 used by newspapers (selection of the above dictionary)
wget -O 1000newspaper_0275-0635.quiz http://sprunge.us/NZGb
Those not in the top 1000:
wget -O not1000newspaper_0275-0635.quiz http://sprunge.us/YARE


Features
- The answer is checked as you type. There is no need to press enter, and when you make a mistake you fail immediately.
- Even long English words have to be typed like this (one typo is allowed), because I don't find that multiple-choice questions help learning at all.
- Mistaken kanji are repeated twice, after N and N*5 questions. You can change N at the beginning.
- It asks whether to change the font (useful for urxvt) and the terminal size.
- There are four exercises:
1. Principal reading
I selected one reading that I found to occur most frequently in the example sentences of edict2 with this site. When in doubt I preferred the ON reading.
There are deliberately no hints at all in this mode. When you get it wrong you will see whether it was looking for the ON or KUN reading, though each kanji has only one entry in this mode. ON is in capitals, and KUN readings often have a dot and okurigana. The dot doesn't need to be typed.

2. English meaning
This too started out being taken from edict2, but I changed almost all of the meanings to use the definitions in the Kodansha Kanji Learner's Dictionary. These favor a meaning that matches the kanji's use in compounds instead of its original meaning.
Because of that there are some kanji that have multiple meanings. They can be entered in any order.
Except for the first letter, you can make one mistake.

3. All readings
This exercise is the hardest, but you do get a lot of hints. Before my first run of this exercise I picked what seemed like the most useful compounds (one compound for every reading of a kanji), but the difficulty was way too high. So I changed everything to use only compounds where all kanji are in the current or previous dictionaries, unless the specific reading of the kanji only occurs together with outside kanji.
I also used a few compounds with very common kanji, kanji that have the same reading as one that looks a lot like it in the current set (I actually prefer to use these), some words that I have heard a lot before, and some that just sounded fun.
Of course that means that several entries are barely ever used in Japanese. Consider such compounds stepping stones for later, and for useless readings try to group them together with kanji that sound the same.
An apostrophe is converted to n.
I will leave English meanings and more useful compounds for later compound specific dictionaries.

4. Stroke order
The more common way of typing out how a kanji is written is with principal radical and other radicals. They all have names and numbers and such.
As I was thinking of the easiest way to implement this, I came up with a new way to represent the kanji and that is by writing it out using "stroke-like" keyboard characters.
For example: 危 becomes ,7-/]l (the last character is a lowercase L)
I made a string like that for every kanji, with many revisions as I was making up more rules. If you question one, consult a site with stroke order, and if you still don't agree please let me know because I want these to feel as natural as possible.
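To make the "checked as you type, one typo allowed" rule from the feature list concrete, here is a rough Bash sketch. It is not taken from kanpeke itself: the function names acceptable_prefix and ask are made up for illustration, and it assumes a typo means a wrong character in the right position (no inserted or dropped letters).

```bash
#!/usr/bin/env bash
# Hypothetical helper, not kanpeke's actual code.
# Return 0 while the typed prefix is still acceptable, 1 once it has failed.
acceptable_prefix() {
    local typed=$1 answer=$2 mistakes=0 i
    # Typing past the end of the answer is always wrong.
    (( ${#typed} <= ${#answer} )) || return 1
    for (( i = 0; i < ${#typed}; i++ )); do
        if [[ ${typed:i:1} != "${answer:i:1}" ]]; then
            (( i == 0 )) && return 1          # the first letter must be right
            (( ++mistakes > 1 )) && return 1  # only one wrong character allowed
        fi
    done
    return 0
}

# Read one key at a time and fail as soon as the prefix is no longer acceptable.
ask() {
    local answer=$1 typed="" char
    while (( ${#typed} < ${#answer} )); do
        IFS= read -rsn1 char || return 1
        typed+=$char
        acceptable_prefix "$typed" "$answer" || { echo " wrong: $answer"; return 1; }
        printf '%s' "$char"
    done
    echo " ok"
}
```

Calling ask danger and typing "dagger" would pass with one mistake used up, while a wrong first letter fails on the first keypress.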

Last edited by Procyon (2011-02-12 20:07:32)


#2 2010-11-30 19:08:59

Awebb
Member
Registered: 2010-05-06
Posts: 6,688

Re: KanPeke - Kanji drill exercises in Bash

I definitely like the idea. Will try it tonight.


#3 2010-12-03 23:28:16

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: KanPeke - Kanji drill exercises in Bash

Some news related to stroke order. I noticed that since yesterday jisho.org has new stroke order images including where the stroke starts, and this information is all from http://kanjivg.tagaini.net/

There is an XML archive on that site, but I'm not sure if it's parseable to the ASCII lookalikes I used. It seems to group things together a bit too much, relying on the drawing coordinates specified after the stroke character.
For example 任
I turned this into /|=-|-
KanjiVG uses ㇒㇑㇒㇐㇑㇐

It's also more specific about small dots and how some strokes hook at the end. That last one is something I wanted to go into more detail in too, especially for ㇖ in for instance , but there are only so many stroke-looking keys on the keyboard.

A comparison of that last kanji:
My version: ,-|]-,--j
KanjiVG: ㇑㇐㇑㇕㇐㇔㇖㇐㇚
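As a rough illustration of why a direct conversion is doubtful, here is a naive one-to-one mapping for just the three stroke types in the 任 example. The mapping is only inferred from lining up the two strings above, and the function name naive_ascii_strokes is made up.

```bash
#!/usr/bin/env bash
# Naive sketch: replace each KanjiVG stroke type with one fixed keyboard character.
# The three pairs below are guessed from the 任 example, not from any official table.
naive_ascii_strokes() {
    sed -e 's|㇒|/|g' -e 's/㇑/|/g' -e 's/㇐/-/g' <<< "$1"
}

naive_ascii_strokes '㇒㇑㇒㇐㇑㇐'   # prints /|/-|-
```

The result /|/-|- differs from the hand-written /|=-|- in the third stroke, which KanjiVG gives the same type as the first one, so the stroke type alone is not enough to reproduce the keyboard strings.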

Anyway I had fun writing the strokes of the first 300 kanji so I will continue to make up my own (keyboard-friendly) versions.

I'm still practicing the first set. As soon as I'm done I will make a quiz file of all 3-stroke radical kanji.

I also paved the way for different types of quiz files by putting parsing information inside the dictionary in the first 7 lines (see the new How To).

I was thinking of making a compound and a one- and two-kanji verb quiz that don't have kanji outside the current set. Getting these entries from edict2 is not that hard using grep, but you will probably get banned from Google trying to find compounds that don't return enough hits to be worth including in a list of 2000.

Even though there are many, it shouldn't be too hard to turn the edict2 entries into 3 quiz files so I might do it anyway even if it's just for some more exposure to the current set.
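The grep step mentioned above could look roughly like the sketch below. It assumes the quiz file is tab-separated with one kanji in the first field, that every first field really is a single kanji (no header lines with regex characters), and that each edict2 line starts with the headword(s), separated by ";" and followed by a space. The function name compounds_in_set is made up.

```bash
#!/usr/bin/env bash
# Sketch: list dictionary entries whose first headword is a compound of two or
# more kanji that all come from the given quiz file.
compounds_in_set() {
    local quiz=$1 dict=$2 alternation
    # Build "k1|k2|k3|..." from the first column, skipping empty lines.
    alternation=$(cut -f1 "$quiz" | grep . | paste -s -d'|' -)
    # Headword made only of those kanji, at least two of them, then " " or ";".
    grep -E "^(${alternation}){2,}[ ;]" "$dict"
}

# Example: compounds_in_set kanji_by_radical_0001-0274.quiz edict2.utf
```

An alternation is used instead of a bracket expression so the match does not depend on the locale treating each kanji as a single character.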


#4 2011-02-11 17:52:55

Sigi
Member
From: Thurgau, Switzerland
Registered: 2005-09-22
Posts: 1,131

Re: KanPeke - Kanji drill exercises in Bash

Ha, started to learn Japanese two weeks ago. Kanatest was cool to learn hiragana and katakana, Kanji is up next :-)  I like the idea of kanpeke and will try it out! Thanks so far!


Haven't been here in a while. Still rocking Arch. :)


#5 2011-02-11 18:22:52

flamelab
Member
From: Athens, Hellas (Greece)
Registered: 2007-12-26
Posts: 2,160

Re: KanPeke - Kanji drill exercises in Bash

Sigi wrote:

Ha, started to learn Japanese two weeks ago. Kanatest was cool to learn hiragana and katakana, Kanji is up next :-)  I like the idea of kanpeke and will try it out! Thanks so far!

Oooh, Kanatest! I didn't know that such a program existed! Thank you, it seems helpful for learning Japanese characters.


#6 2011-02-12 20:46:05

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: KanPeke - Kanji drill exercises in Bash

I updated the first post with some new types of dictionaries for set 1 and 2 and fixed a bug in the first dictionary.

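I recommend using the "top 1000 used by newspapers" dictionaries. Here is the code to make your own:

# Kanji from the quiz file that kanjidic ranks within the 1000 most frequent (F field)
top1000=$(grep "$(cut -f1 kanji_by_radical_0001-0274.quiz)" kanjidic.utf |
    sed -n 's/\(.\).* F\([0-9]*\) .*/\2\t\1/p' |
    awk '($1<=1000) {print "^" $2}')
# Keep only the quiz entries that start with one of those kanji
grep "$top1000" kanji_by_radical_0001-0274.quiz > 1000newspaper_0001-0274.quiz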

It needs kanjidic, from the edict family of dictionaries:

#!/bin/bash
# Download each dictionary, decompress it, and convert it from EUC-JP to UTF-8
for dictionary in edict2 radkfile kradfile kanjidic examples; do
    curl -s "ftp://ftp.monash.edu.au/pub/nihongo/$dictionary.gz" | gunzip | iconv -f EUC-JP -t UTF-8 > "$dictionary.utf"
done

While practicing I recommend using a companion resource, like Heisig's Remember the Kanji or Kanjidamage. I prefer an etymological approach like www.kanjinetworks.com. The search form on that site only works when logged in, but you can make a "Keyword" search in Firefox for it which works even when logged out.


#7 2011-02-13 08:32:32

Jimi
Member
From: Brooklyn, NY
Registered: 2009-09-25
Posts: 125

Re: KanPeke - Kanji drill exercises in Bash

This looks amazingly useful. Can't wait to try it out!

Thanks for the great project.


#8 2011-03-12 22:58:09

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: KanPeke - Kanji drill exercises in Bash

I converted the dictionary of 1-, 2-, and 3-stroke radical kanji that are in the top 1000 of newspapers to an Anki importable format:
http://sprunge.us/jfWX
It makes a basic deck: [Front:Compound] [Back:Reading\n\nEnglish]. I haven't figured out how to do more in Anki.
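For anyone building a similar deck, a conversion to that front/back layout could be sketched as follows. The input layout (tab-separated compound, reading, English) is an assumption and not the actual quiz file format, the function name to_anki is made up, and the `<br>` tags rely on Anki's text import being set to allow HTML in fields.

```bash
#!/usr/bin/env bash
# Sketch: turn "compound<TAB>reading<TAB>English" lines into two-field Anki
# import lines: front = compound, back = reading, blank line, English.
to_anki() {
    awk -F'\t' 'NF >= 3 { printf "%s\t%s<br><br>%s\n", $1, $2, $3 }' "$@"
}

# Example with a hypothetical input file: to_anki compounds.tsv > anki_import.txt
```

Lines with fewer than three fields are skipped rather than imported half-empty.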

I got rid of principal meaning/reading and the kanji itself and just used all the compounds. Out of convenience, I mostly used edict2 to find the English meanings of compounds. However, I got a lot of them originally from the Kanji Learner's Dictionary, and I'll try to look them all up in there too. Alc is also a great dictionary, with entire sentences and much better English definitions than edict2, but it has a strict anti-copy policy.

I also left out the stroke order drawings. If you reverse the cards, I wonder if there is always enough information to know the kanji, with a few similar KUN/meaning words. From there you could imagine drawing it, or actually draw it in a notebook. At least, I thought drawing kanji (even with keyboard strokes) was helpful.


#9 2011-10-16 13:06:14

awayand
Member
Registered: 2009-09-25
Posts: 398

Re: KanPeke - Kanji drill exercises in Bash

What is the advantage of using this over, say, Anki + Heisig?


#10 2011-10-16 14:20:39

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: KanPeke - Kanji drill exercises in Bash

Sorry to say I won't continue this project.

Anki has a mode where you have to type the answer. (It's not as unforgiving as Kanpeke though.)

I'm sure any advantage of the subject matter comes down to novelty. If you like Heisig order, go with that. Or make your own Anki deck.

Being able to write a kanji from memory is important. The stroke order training of Kanpeke helped a bit, but you should try doing that with pen and paper, probably.

Learning one example of each reading was pretty useful for me too. I definitely recommend doing that if you make your own deck.

