Link to pretzel's git: http://github.com/houbysoft/meaningcmp
Lexion (creator of Cookie):
Competition would be nice.
Well, the function at the core of this (mncmp()) was first meant to be a contribution to Cookie, but since Lexion is porting Cookie to C, it won't work there because it uses nltk (Python), so I decided to write a simple wrapper around it.
Pretzel is a simple "AI assistant" like Cookie, but he/she's "smarter" in a way because he/she does not do just (chunked) string comparison, but tries to find out the similarity of the meanings of key/value pairs, because I think this is what matters; functions and custom commands such as notes, package installation etc. are pretty trivial to add later.
For now, Pretzel only supports either giving you back a simple text reply or running a shell command; in theory you can build everything else on top of that (notes, package installation, etc.). This keeps it small and quite straightforward: the source is currently only 139 lines in total, comments included, covering both the mncmp function and pretzel itself.
Small demo showing what I mean by meaning comparison (on the left is Pretzel, on the right is Cookie):
As you can see, in Cookie you need to enter the same command every time you alter your text a little bit; in Pretzel, you only do it once, and then similar sentences should be recognized afterwards (in the screenshot, Chrome starts after each command entered in pretzel except the initialization).
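To make the idea concrete, here is a minimal sketch of matching by meaning rather than by exact string. It is not Pretzel's code: the real mncmp() relies on nltk, while the toy synonym table and the names `TOY_SYNONYMS`, `word_similarity`, `sentence_similarity` and `best_match` below are hypothetical stand-ins so that the example runs without any data downloads (Python 3).

```python
# Toy stand-in for a meaning-based similarity score (hypothetical; the real
# mncmp() relies on nltk, which is not used here).
TOY_SYNONYMS = [
    {"start", "launch", "open", "run"},
    {"chrome", "browser"},
]

def word_similarity(a, b):
    """Return 1.0 for identical or 'synonymous' words, else 0.0."""
    if a == b:
        return 1.0
    return 1.0 if any(a in group and b in group for group in TOY_SYNONYMS) else 0.0

def sentence_similarity(s1, s2):
    """Average, over the words of s1, of the best match found in s2."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    if not w1 or not w2:
        return 0.0
    return sum(max(word_similarity(a, b) for b in w2) for a in w1) / len(w1)

def best_match(user_input, knowledge, threshold=0.5):
    """Return the stored response whose key is closest in meaning, or None."""
    best_key = max(knowledge,
                   key=lambda k: sentence_similarity(user_input, k),
                   default=None)
    if best_key is None or sentence_similarity(user_input, best_key) < threshold:
        return None
    return knowledge[best_key]

knowledge = {"start chrome": "/shell chromium"}
print(best_match("launch browser", knowledge))    # matches the stored key
print(best_match("weather tomorrow", knowledge))  # nothing close enough
```

The point is only that the command is taught once ("start chrome") and a differently worded request can still reach it.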
I hope you like it, feedback would be nice. It's on git only for now, maybe I'll create a PKGBUILD for it later:
http://github.com/houbysoft/meaningcmp
(warning : expect bugs, this is a version in a very early stage)
Offline
I tried to use it after installing python-nltk from aur, but get the following error:
$ ./pretzel.py 
Loading data...  [failed] (new database will be created after a clean exit)
Initializing mncmp()... 
Traceback (most recent call last):
  File "./pretzel.py", line 77, in <module>
    p = Pretzel()        
  File "./pretzel.py", line 36, in __init__
    mncmp("doors","walls") # need to pass anything through it once so that the wordnet dictionnaries get loaded etc.
  File "/home/me/builds/pretzel/mncmp.py", line 13, in mncmp
    s1_postags = nltk.pos_tag(s1_tokenized)
  File "/usr/lib/python2.6/site-packages/nltk/tag/__init__.py", line 62, in pos_tag
    tagger = nltk.data.load(_POS_TAGGER)
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 590, in load
    resource_val = pickle.load(_open(resource_url))
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 669, in _open
    return find(path).open()
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 451, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not
  found.  Please use the NLTK Downloader to obtain the resource:
  >>> nltk.download().
  Searched in:
    - '/home/me/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************
Trying to get that resource was unsuccessful:
>>> import nltk
>>> nltk.download('taggers/maxent_treebank_pos_tagger/english.pickle')
/usr/lib/python2.6/site-packages/nltk/__init__.py:588: DeprecationWarning: object.__new__() takes no parameters
[nltk_data] Error loading
[nltk_data]     taggers/maxent_treebank_pos_tagger/english.pickle:
[nltk_data]     Package
[nltk_data]     'taggers/maxent_treebank_pos_tagger/english.pickle'
[nltk_data]     not found in index
False
>>> nltk.download(taggers/maxent_treebank_pos_tagger/english.pickle)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'taggers' is not defined
>>> nltk.download(english.pickle)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'english' is not defined
I'm not familiar with this stuff, can you give me a hint on how I could get it to work?
Offline
Hi, I have the "book" collection in the nltk.download() installed, that should solve the problem, or you could install maxent_treebank_pos_tagger, but maybe that won't be all that is needed, so it's probably easier to just get everything that's used in the nltk book:
import nltk
nltk.download('book')
I hope this helps.
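For anyone who prefers to fetch only the missing tagger, a small check-then-download sketch follows. The resource path and the package name are taken from the error message and the reply above; `nltk.data.find` and `nltk.download` are existing nltk calls, while the helper name `ensure_tagger` and its injectable parameters are hypothetical and only there so the logic can be exercised without network access. As noted above, the tagger alone may not be everything Pretzel needs.

```python
RESOURCE = "taggers/maxent_treebank_pos_tagger/english.pickle"
PACKAGE = "maxent_treebank_pos_tagger"  # the package id, not the resource path

def ensure_tagger(find=None, download=None):
    """Download the POS tagger only if nltk cannot find it locally.

    Returns True if a download was triggered, False if the resource was present.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested without nltk
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(RESOURCE)          # raises LookupError when the resource is missing
        return False
    except LookupError:
        download(PACKAGE)
        return True
```

Calling `ensure_tagger()` with no arguments before starting pretzel.py would use the real nltk functions.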
Offline
Thanks, that did the job. Maybe a readme note on this would be helpful.
I tried pretzel a bit, and here are my first comments:
- at start up, could you let the user know that 'exit' will quit the program, at least for the first time (when you have the message that the knowledge base will be built at the end of the session). Also letting the user know where the knowledge base would be useful (I know you can just read the source code for that, but still).
- I noticed that the knowledge base got littered with empty commands. I think that when pretzel asks for an explanation, and only gets an empty string, nothing should be added to the knowledge base
- pretzel has a hard time with cli programs that ask for feedback. For instance sudo pacman -Syu won't go well, because you need to enter your password. In the same vein, if I launch vim from pretzel, usage of the editor is broken (input is not working). Can this be fixed?
- just wondering: since this is python, why not have (like /shell) a /python binding to add functionality on the fly? Using /shell for that would only create new instances of the interpreter, if I'm correct?
On another level, I'm wondering about the philosophy and capabilities of pretzel. You seem to favor the view that pretzel should be minimalistic, and the user builds his own environment. I like that, but I just wonder how far can one go:
For instance if I want to have a command that modifies the knowledge base, could I get pretzel to use the following:
pretzel.dat
kb: ~/builds/pretzel/pretzel.dat
edit: /shell vim
and then use
>edit kb
i.e. is it possible to combine commands?
Offline
BTW, thanks for sharing pretzel! 
Offline
Thanks for your feedback!
Unfortunately I'm leaving right now for a while so I can't implement your suggestions, but I will look into them this evening when I come back.
Offline
Sure, take your time.
Meanwhile, I've continued to think about pretzel and cookie, and I'm intrigued by the concept, even though at the moment they just look like bloated 'alias' replacements. On the other hand, this whole thing would take another dimension if the metaphor was changed, and instead of being a prompt, pretzel became a daemon:
Since you call pretzel an AI assistant, I would take you to the task, and expect from pretzel that it does the job of a secretary. You can also think of a mother, friend, SO, boss analogy, but to remain politically correct let's call it a secretary (and for the same reasons, let's use "it" for the secretary).
From a secretary, I expect that it not only carries out my orders (pretzel> call my sister at home; pretzel> weather tomorrow; pretzel> news?), but that it also takes the initiative. So for instance it could be set up to ask me once a day 'how are you?', and file the response (or lack of it) in a kind of minimal diary.
Another example:
[00:00] <pretzel> reminder: it's time to switch off and go to bed
[00:00] <me> remind me in half an hour
...
[00:30] <pretzel> reminder: you should have disconnected 30' ago
[00:31] <me> whatever
...
[00:40] <pretzel> reminder: you should have disconnected 40' ago
[00:50] <pretzel> reminder: you should have disconnected 50' ago
[00:53] <me> cancel switch off reminder for today
note: since I think pretzel could be a daemon, probably it could behave like an irc bot, with the added advantage that you could actually interface pretzel to irc
3rd example: data mining the user PIM
a) using a location parameter, accessing pretzel via ssh
[14:44] <me> weather today
[14:44] <pretzel> Munich: Snow showers, -1°C, Wind: W at 16 km/h, Humidity: 93%
[14:45] <me> I'm in Cape Town [wish I was]
[14:45] <pretzel> location updated
[14:45] <me> weather today
[14:45] <pretzel> Cape Town: Clear, 27°C, Wind: SW at 24 km/h, Humidity: 58%
b) checking mail activity for [family] contacts
[22:00] <pretzel> reminder: you have not written back to aunt Lisa for two months
[22:02] <me> called her last week; remind contact aunt Lisa next month
c) assuming the user uses GTD project management methodology:
[09:00] <pretzel> you have not reviewed your projects for two weeks. Would you like to review them now?
[09:01] <me> yes *shudder*
Whatever backends pretzel uses for its notification system (remind, agenda programs, mail filtering, etc), what matters is that it can itself initiate communication, which is a powerful way of interacting with (lazy) humans.
irc bot analogy: as a daemon, pretzel should be able to interface with irc, IM, phone, mail, etc. In fact, combined with voice synthesis and recognition, it could even have some real world applications, like helping visually impaired people, or allowing the user to call pretzel to get his business done over the phone, etc.
grammar
I don't know how one would implement a grammar, but I guess it would be necessary to have some system. You already have the beginnings of it (/shell). Should one develop such context tags (/python,  /contact, etc?), or is this a dead end?
Desired case study:
>call my sister at home
Pretzel should initiate a skype call to my sister at home, using the contact database entry for my sister to get the details
>text frank: how about going for a beer tonight?
Pretzel should find the cell phone details for frank and text him the message
All this depends on how pretzel manages to parse user input into chains of commands/arguments/etc
cpaste behavior
in IPython, cpaste offers a simple way to deal with multi-line input, which at the moment seems to be lacking in pretzel:
>mail sister: About next week's dinner:
>> Hi sis,
>>
>> I've talked to mom, and she's ok for a dinner at my place next Saturday. Would you be available?
>>
>> Take care,
>> --
pretzel> mail sent to sister
>
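A rough sketch of that cpaste-like behavior, assuming (as in the example above) that a line containing only `--` ends the message. The function name `read_block` is hypothetical, and this is not how IPython implements cpaste; it only shows how little is needed to collect a multi-line body.

```python
def read_block(lines, terminator="--"):
    """Collect lines until one equals the terminator; return them joined."""
    body = []
    for line in lines:
        if line.strip() == terminator:
            break
        body.append(line.rstrip("\n"))
    return "\n".join(body)

sample = ["Hi sis,\n", "\n", "Take care,\n", "--\n", "ignored\n"]
print(read_block(iter(sample)))
```

In an interactive session, `read_block(sys.stdin)` would read the body straight from the terminal.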
buffer management
Perhaps this is best left for the user to set up, or maybe not, but at least one should have a functional way of getting pretzel to manage buffers, à la screen or vim, to be able to deal with CLI applications that take over the screen and not just the prompt (vim, etc). But in fact at the moment even cli scripts that require feedback from the user are a problem, as mentioned previously.
reserved words, help, bugs
I noticed that typing '>list' will cause pretzel to exit. This is probably more of a bug than a feature, but in any case, it would be good to have an accessible list of reserved words (so far I only know exit and /shell). And in general some help sheet/system/man entry would be useful.
So far I've been skeptical and at the same time intrigued by these kinds of programs, but perhaps there is really some potential for them to go beyond what a shell can already do?
Offline
Letting the user know about 'exit': done
Empty commands should be ignored: fixed
Vim / sudo problems: this was caused by programs being launched with a '&' appended, which seemed a good idea for launching Chrome and such (you probably don't want pretzel to wait until chrome quits to use it again), but it is true that some cli programs will then have problems. Therefore, I decided to remove it; you can always add the '&' for programs where you want it in the /shell command. Now it should work fine.
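To illustrate the difference, here is a minimal sketch using the standard subprocess module on a POSIX shell. How pretzel.py actually launches commands is not shown in this thread, so `run_shell` is a hypothetical helper, not the real implementation.

```python
import subprocess

def run_shell(command):
    """Run a /shell command through the system shell and return its exit code.

    Without a trailing '&' the call blocks until the program exits, and the
    child shares this terminal, so password prompts and editors can read input.
    With a trailing '&' the shell backgrounds the job and returns right away.
    """
    return subprocess.call(command, shell=True)

print(run_shell("echo foreground"))
```

With this approach, a stored response such as `/shell chromium &` keeps the prompt usable while the browser runs, whereas `/shell vim` waits for the editor to exit.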
Python binding: added. Example usage would be something like:
/python print(self.pretzel_keys)
to show the database.
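One plausible way such a binding can work is to pass the text after `/python` to `exec` with `self` in scope. This is only a sketch under that assumption, written for Python 3 (the tracebacks in this thread are from Python 2.6): the class `MiniPretzel` and the method `respond` are hypothetical, only the attribute name `pretzel_keys` comes from the thread, and its real type is not shown here (a dict is assumed).

```python
import contextlib
import io

class MiniPretzel:
    """Hypothetical stand-in; only the name pretzel_keys is from the thread."""

    def __init__(self):
        self.pretzel_keys = {"hello": "Hi there!"}  # assumed to be a dict

    def respond(self, response):
        """Return plain text as-is; execute '/python ...' with access to self."""
        prefix = "/python "
        if response.startswith(prefix):
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                # exec runs arbitrary code, so this only makes sense for
                # entries the user typed into their own knowledge base.
                exec(response[len(prefix):], {"self": self})
            return buffer.getvalue()
        return response

p = MiniPretzel()
print(p.respond("/python print(self.pretzel_keys)"), end="")
```

Because the code runs inside the same interpreter, it can inspect or modify the running instance, which a new interpreter started via /shell could not do.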
Arguments: this is not yet possible, but it is definitely on the roadmap.
As for making it run as a daemon, initiate conversation, etc.:
That would be very interesting indeed, but also quite hard to implement; this would pretty much imply a true/general AI (this is not really AI yet - that's why I used quotes for "AI assistant" in the first post).
For the reminders, etc., my approach would be to use external programs and /shell. For example, one could make a (simple, but separate) program that would run in the background, read a file containing reminders, parse it, and print the output. This could be triggered by "remind me" or something like that. For adding reminders, one could use another separate program that would modify the file used by the program running in the background.
However, both of these things would require arguments which are not yet implemented, but I'll (if I get some time) work on that.
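As a sketch of what such a separate reminder program might look like: the file format (`HH:MM message` per line, `#` for comments) and the names `parse_reminders` and `due_reminders` are assumptions made up for this example, not something pretzel defines.

```python
from datetime import time

def parse_reminders(text):
    """Parse lines of the form 'HH:MM message' into (time, message) pairs."""
    reminders = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        stamp, _, message = line.partition(" ")
        hour, minute = stamp.split(":")
        reminders.append((time(int(hour), int(minute)), message.strip()))
    return reminders

def due_reminders(reminders, now):
    """Return the messages whose time is at or before 'now'."""
    return [message for when, message in reminders if when <= now]

sample = "# reminders file\n00:00 switch off and go to bed\n09:00 review projects\n"
print(due_reminders(parse_reminders(sample), time(0, 30)))
```

A second small program appending lines to the same file would cover adding reminders; both could then be wired to pretzel through /shell once arguments are available.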
The reason for this approach of mine rather than what Cookie does (have it built right in its code) is that I think that if this ever gets complete and usable, people will want millions of such features, and possibly everyone a different set. Therefore, it's not really elegant to give everybody the same; rather, they should get a general "core" pretzel, and download "add-ons" as they see fit. This should also make it a lot easier for others to develop these "add-ons" - they could be developed independently, without having to fork / push to the "official" git repo.
The same approach could be used for email, for example: "mail sister: about next week's dinner:" should launch a mail entry program with the arguments "sister", automagically resolved into your sister's email address (probably by the external program), and "about next week's dinner", which should be identified as the subject. The program itself could then ask for the body of the message, which could easily be multi-line.
For the skype/texts, arguments implementation would also be necessary.
Buffers: partially resolved now.
Bug: That is weird, it doesn't quit here, does it print some error? What's in your pretzel.dat? As for the help files, this is needed but I don't have much time right now. I'll do it at some point though. Feel free to write something if you feel like it, btw.
And again, thanks for your comments.
Offline

@y27:
I agree that the grammar is better in pretzel, but I have some new secretary features coming. Anyways, nice bot/ai/secretary.
urxvtc / wmii / zsh / configs / onebluecat.net
Arch will not hold your hand
Offline
@Lexion:
cool, it's always good to have multiple approaches.
Anyways, nice bot/ai/secretary.
Thanks 
@AlexS and others:
In a recent commit, I added the installation instructions to the README:
http://github.com/houbysoft/meaningcmp/ … 89922d5ec2
Offline
Update: in the latest git, arguments now (approximately) work.
It is not yet perfect, mainly because of the way nltk tokenizes strings; i.e., it thinks that in "test.txt", the dot means a new sentence, which breaks things since "test.txt" should be thought of by pretzel as a single word.
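One possible workaround, shown purely as an illustration and not as what pretzel does, would be to split command text on whitespace and strip only trailing punctuation, so that a dot inside a token is left alone. The function name `tokenize_command` is hypothetical.

```python
def tokenize_command(text):
    """Split on whitespace; strip trailing punctuation but keep inner dots."""
    tokens = []
    for raw in text.split():
        token = raw.rstrip(".,!?")  # only the end of each token is touched
        if token:
            tokens.append(token)
    return tokens

print(tokenize_command("Please open test.txt."))
```

The trade-off is that this is far cruder than a real tokenizer for ordinary sentences, so it would only suit the argument part of a command.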
Nevertheless, it is now possible to use it for example for package installation:
Offline
Maybe this is too geeky - but could we hook up a speech to text input on it?
Libertarian Arch Linux User
Offline
Vim / sudo problems: this was caused by programs being launched with a '&' appended, which seemed a good idea for launching Chrome and such (you probably don't want pretzel to wait until chrome quits to use it again), but it is true that some cli programs will then have problems. Therefore, I decided to remove it; you can always add the '&' for programs where you want it in the /shell command. Now it should work fine.
Much better, thanks, now pretzel is usable. Best to let users set up their own '&'; that seems to be the philosophy of pretzel anyway.
Python binding: added. Example usage would be something like:
/python print(self.pretzel_keys)
to show the database.
sweet!
Arguments: this is not yet possible, but it is definitely on the roadmap.
I think this is the most important, at the moment. Will test your implementation later today, if possible.
Also, I think the README.txt is now much more helpful. I guess with usage I will find things to add to it.
As for making it run as a daemon, initiate conversation, etc.:
That would be very interesting indeed, but also quite hard to implement; this would pretty much imply a true/general AI (this is not really AI yet - that's why I used quotes for "AI assistant" in the first post).
Humm, don't think this requires a real AI (if one means by AI some self-learning algorithm), but rather a scheduler and to turn pretzel into a multithreaded application. Also, you would need to be able to queue the messages that pretzel creates over time. You could then use different front ends (daemonize, irc bot, curses application, etc).
For the reminders, etc., my approach would be to use external programs and /shell. For example, one could make a (simple, but separate) program that would run in the background, read a file containing reminders, parse it, and print the output. This could be triggered by "remind me" or something like that. For adding reminders, one could use another separate program that would modify the file used by the program running in the background.
That's how I see it too. You can use cron, remind, and/or your own scheduler written as a plug-in. Nevertheless, something must be built-in, and I think it's the multithreading and queueing.
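A minimal sketch of that built-in part, using only the standard library: a background thread decides when a message is due and puts it on a queue, and whatever front end is in use drains the queue. The function name `scheduler`, the `(delay_seconds, text)` reminder format and the sample messages are all assumptions for illustration.

```python
import queue
import threading
import time

def scheduler(outbox, reminders, stop):
    """Background thread: queue each (delay_seconds, text) reminder when due."""
    start = time.monotonic()
    for delay, text in sorted(reminders):
        remaining = delay - (time.monotonic() - start)
        # Event.wait() returns True if stop was set, so the thread can exit early.
        if remaining > 0 and stop.wait(remaining):
            return
        outbox.put(text)

outbox = queue.Queue()
stop = threading.Event()
worker = threading.Thread(
    target=scheduler,
    args=(outbox, [(0.02, "how are you?"), (0.01, "time to go to bed")], stop),
    daemon=True,
)
worker.start()
worker.join()  # a real program would keep running its prompt here instead

# A front end (prompt, IRC bot, ...) would drain the queue whenever convenient.
messages = []
while not outbox.empty():
    messages.append(outbox.get())
print(messages)
```

Because the producer and the front end only share the queue, the same scheduler could feed a daemon, an IRC bot or a curses interface.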
The reason for this approach of mine rather than what Cookie does (have it built right in its code) is that I think that if this ever gets complete and usable, people will want millions of such features, and possibly everyone a different set. Therefore, it's not really elegant to give everybody the same; rather, they should get a general "core" pretzel, and download "add-ons" as they see fit. This should also make it a lot easier for others to develop these "add-ons" - they could be developed independently, without having to fork / push to the "official" git repo.
amen
Buffers: partially resolved now.
vim works fine now, which is a must for me to use it.
Bug: That is weird, it doesn't quit here, does it print some error? What's in your pretzel.dat? As for the help files, this is needed but I don't have much time right now. I'll do it at some point though. Feel free to write something if you feel like it, btw.
It just quits python without an error (maybe because you pass exceptions?). In any case, I did not see anything special in my pretzel.dat, but unfortunately can't paste it to you now (anyways it's almost empty, and 'list' is not present in the file). The bug is still happening with the version I pulled 12 hours ago.
Later I'll post you the pretzel.dat, or just delete it and check if it happens anyways.
On another topic, have you considered using the cmd module? It could give you some goodies right away (history, bash style history, a help system). Capabilities could be extended with cmd2 (Devlin gave a talk on cmd and cmd2 at PyCon 2010). I tried to hack pretzel, but it's not working so far (probably some stupid mistake), and at the moment I have no time to dig further. You can check my effort at cmd.pretzel.py.
On the other hand, you may not want to refactor your program to use cmd, or you may simply dislike that approach. Let me know, so that I can decide whether I should spend more time on it.
Offline
Just a quick reply until I get more time:
@matthewbauer:
Why not, if there is an open source program/library to do this, it should be very easy to hook it up.
@AlexS:
I don't really know cmd / cmd2, but from a (very) quick read of some google results I understand these are something like libreadline in C? If yes then sure, it is kind of annoying to not be able to edit (with the left arrow for example) the text right now anyway.
I'll watch the talk later, hopefully.
Last edited by y27 (2010-03-08 11:57:22)
Offline
@matthewbauer:
Why not, if there is an open source program/library to do this, it should be very easy to hook it up.
One could have a look at CMU Sphinx. Being able to speak on the phone with pretzel would be nice.
I don't really know cmd / cmd2, but from a (very) quick read of some google results I understand these are something like libreadline in C?
Yup, that's the idea.
Would still be interested in having cpaste behavior as implemented in ipython.
Offline
Another quick message; now pretzel uses the python cmd module as suggested by AlexS.
I learned the basics from your code, and then modified it so that less code needs to be rewritten (mainly, I did not use at all the precmd() hook).
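For readers who have not used it, here is a small self-contained sketch of the standard cmd module, written for Python 3. It is not pretzel's code: the class `MiniShell` and its `answers` table are hypothetical. It shows the pieces that matter for a program like this: `default()` receives every line that is not a built-in command, a `do_*` method returning a true value ends the loop, and the base initializer sets up the streams that built-in commands write to.

```python
import cmd
import io

class MiniShell(cmd.Cmd):
    prompt = "> "

    def __init__(self, stdout=None):
        # The base initializer sets self.stdin and self.stdout, which built-in
        # commands such as 'help' write to.
        cmd.Cmd.__init__(self, stdout=stdout)
        self.answers = {"hello": "Hi there!"}

    def do_exit(self, arg):
        """Leave the shell."""
        return True  # a true return value stops cmdloop()

    def default(self, line):
        # Called for any input that has no matching do_* method.
        self.stdout.write(self.answers.get(line, "I don't know what to do!") + "\n")

out = io.StringIO()
shell = MiniShell(stdout=out)
shell.onecmd("hello")
shell.onecmd("help")
stopped = shell.onecmd("exit")
print(out.getvalue())
```

Running `MiniShell().cmdloop()` instead gives the interactive prompt, with line editing and history where the readline module is available.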
Offline
Very nice! now we have history.
But please note that there is another bug: entering the Cmd reserved word 'help' causes pretzel to crash:
$ ./pretzel.py 
Loading data...  [done]
Initializing mncmp()...  [done]
> help
Traceback (most recent call last):
  File "./pretzel.py", line 98, in <module>
    p.cmdloop()
  File "/usr/lib/python2.6/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib/python2.6/cmd.py", line 219, in onecmd
    return func(arg)
  File "/usr/lib/python2.6/cmd.py", line 338, in do_help
    self.stdout.write("%s\n"%str(self.doc_leader))
AttributeError: Pretzel instance has no attribute 'stdout'
Even manually inserting 'help' as an entry in pretzel.dat did not help.
Also, I still have the odd behavior that pretzel closes when I type 'list'. It closes cleanly, and it writes a pretzel.dat if there was none. Still don't understand what's going on, wonder if something related to nltk...
Tested arguments, and it works as advertised. Like you say it's very limited, it can't for instance combine two entries in pretzel.dat, etc (interprets the second key literally, instead of picking up the item). Do you have an idea on how to overcome this?
Offline
Fixed issue with 'help', will look into those two others later.
Offline
Fixed the issue with "list" in latest revision; it was due to setting minsim (minimal value required by mncmp() to think that two words are similar) to a value which was too low -- for some reason, wn.path_similarity returned (1/3) for "list" and "exit", so pretzel thought you meant "exit" by "list".
I have now switched minsim to 0.34, which seems to work well.
You can now easily play with it anyway by changing self.minsim in pretzel.py, as I've added it as an optional argument for mncmp() (pretzel always passes it self.minsim).
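The numbers can be checked with a few lines. WordNet's path similarity is defined as 1 / (1 + shortest path length), so a score of 1/3 corresponds to two steps between the senses, and 0.34 sits just above it. The helper names below are hypothetical, the `>=` comparison is an assumption about how mncmp() applies the threshold, and the "too low" value of 0.3 is made up, since the previous setting is not given in this thread.

```python
def path_similarity_from_distance(distance):
    """WordNet path similarity is 1 / (1 + shortest path length)."""
    return 1.0 / (1 + distance)

def is_similar(score, minsim=0.34):
    """Assumed threshold test; None means the words could not be compared."""
    return score is not None and score >= minsim

score = path_similarity_from_distance(2)  # 1/3, as reported for "list"/"exit"
print(is_similar(score, minsim=0.3))      # True: a lower threshold lets it through
print(is_similar(score))                  # False: 0.34 rejects it
```

Any pair of words whose senses are only one step apart scores 0.5 and still passes with the new setting.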
Offline
Added possibility to "combine" commands, again this is very basic right now though; it is mainly limited by the current basic implementation of arguments.
Anyway, here is a sample conversation using the new feature:
~/devel/NLP/meaningcmp$ ./pretzel.py 
Loading data...  [done]
Initializing mncmp()...  [done]
> mail arg1
I don't know what to do!
Please enter desired response (simple text, /shell command, or /python command): I don't have any email plugin yet, but at least I know I should send this email to arg1.
> alice
I don't know what to do!
Please enter desired response (simple text, /shell command, or /python command): alice@wonderland.org
> mail alice
I don't have any email plugin yet, but at least I know I should send this email to alice@wonderland.org.
>
Offline
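The combination shown in that sample conversation can be sketched as a lookup followed by a second lookup on the argument. This is a literal-matching illustration only (Pretzel matches by meaning), and the function `respond` is hypothetical; the knowledge base entries mirror the ones entered in the sample, with a shortened reply text.

```python
def respond(user_input, knowledge):
    """Answer user_input; 'arg1' in a key matches one word of the input."""
    if user_input in knowledge:
        return knowledge[user_input]
    words = user_input.split()
    for key, response in knowledge.items():
        key_words = key.split()
        if len(key_words) != len(words) or "arg1" not in key_words:
            continue
        slot = key_words.index("arg1")
        # All words other than the argument slot must match literally here.
        if key_words[:slot] + key_words[slot + 1:] != words[:slot] + words[slot + 1:]:
            continue
        # Combine entries: if the argument is itself a known key, use its value.
        value = knowledge.get(words[slot], words[slot])
        return response.replace("arg1", value)
    return None

knowledge = {
    "mail arg1": "I should send this email to arg1.",
    "alice": "alice@wonderland.org",
}
print(respond("mail alice", knowledge))
```

An argument that is not in the knowledge base, such as "mail bob", is simply passed through unchanged.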
After quite a big pause in development, I've finally done some work again.
I wrote a new system to replace the mncmp() function, now it's argmatch() in the argmatch.py file.
The main new feature is that "multi-word" argument matching now works. Here is an example which would not work in the last version. I also changed the argument system a little: from now on, arguments simply start with a ` (backquote) instead of the arg1..n.
~/devel/NLP/meaningcmp$ ./pretzel.py 
Loading data...  [done]
Initializing argmatch()...  [done]
> call `command with `args
I don't know what to do!
Please enter desired response (simple text, /shell command, or /python command): /shell `command `args
> call echo with Hello world!
Hello world !
>
As always, any contributions, comments etc. are welcome.
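For readers curious how backquote arguments with multi-word values could be captured, here is a sketch that turns the pattern into a regular expression. This is an assumption-laden illustration, not argmatch(): the real function matches by meaning, while this one matches the fixed words literally, and the names `compile_pattern` and `expand` are hypothetical.

```python
import re

def compile_pattern(pattern):
    """Turn 'call `command with `args' into a regex with named groups."""
    parts = []
    for token in pattern.split():
        if token.startswith("`"):
            parts.append("(?P<%s>.+?)" % token[1:])  # argument: one or more words
        else:
            parts.append(re.escape(token))           # fixed word, matched literally
    return re.compile(r"\s+".join(parts))

def expand(pattern, template, user_input):
    """Fill the template's backquote names with values captured from the input."""
    match = compile_pattern(pattern).fullmatch(user_input)
    if match is None:
        return None
    values = match.groupdict()
    return re.sub(r"`(\w+)", lambda m: values[m.group(1)], template)

print(expand("call `command with `args",
             "/shell `command `args",
             "call echo with Hello world!"))
```

Because the last group has to reach the end of the input, `args` captures both words of "Hello world!" while `command` stays at "echo".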
Last edited by y27 (2010-08-07 17:39:20)
Offline