Monday, June 22, 2009

Start Hacking Montezuma

It is very bad that I suspended my study of LISP for some weeks. I should concentrate on things that are meaningful and stay away from meaningless ones, such as arguing in forums and reading entertainment stories.

Following Leslie's suggestion, I am preparing to hack on Montezuma. First, I read the paper "An Object-Oriented Architecture for Text Retrieval". It is a great paper: it uses an elegant, simple approach to accommodate a scalable, complex architecture. I also understand that Montezuma cannot use the paper's code samples.

I suppose fixing bugs is a good way to get involved in an open source project, :).

standard tokenizer hangs on some input

As Edi Weitz pointed out, the culprit is the complex regular expression (the token-regexp method) in standard-tokenizer.lisp. I have reduced the problem to a simple case:

CL-USER> (cl-ppcre:scan
              (cl-ppcre:create-scanner
                 "(_\\w+)*\\@\\w+") "_______________________________________"
                          :start 0)
;; Evaluation aborted.

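I suspect the cause is that '\w' also matches the underscore, so the '_' and the '\w+' inside the repeated group can split the same run of characters in many different ways. Replacing the underscore with any other character that '\w' matches triggers the problem too: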

CL-USER> (cl-ppcre:scan (cl-ppcre:create-scanner
               "(a\\w+)*\\@\\w+") "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
               :start 0)
;; Evaluation aborted.

cl-ppcre is a Perl-compatible regular expression library, so I should check the same thing in Perl. Perhaps Perl is more efficient at regular expression matching: even after I raised the number of underscores, it still finished without trouble.

$str = "john._______________________________________
__________________________________";

if ($str =~ m/(_*\w+)*\@\w+/)
{
   print "ok\n";
}

To conclude, this is not Montezuma's bug but cl-ppcre's.
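
One more data point can be gathered from the shell. This is only a sketch: it assumes a grep that supports -E and POSIX bracket expressions, and it spells '\w' as [[:alnum:]_]. Many grep implementations (GNU grep, for instance) match with automata rather than backtracking, so if the reduced pattern returns at once there, the slowness comes from how the pattern is searched and not from the pattern or the input as such.

```shell
#!/bin/sh
# Same shape as the reduced pattern, with \w written as a POSIX bracket expression.
pattern='(_[[:alnum:]_]+)*@[[:alnum:]_]+'

# A run of 39 underscores and no "@", like the input in the reduced case.
input=$(printf '%039d' 0 | tr 0 _)

if printf '%s\n' "$input" | grep -Eq "$pattern"; then
    echo "match"
else
    echo "no match"
fi
```

It says nothing about how cl-ppcre works internally; it only shows that the input itself is harmless.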

broken :must-not-occur or phrase query

I found that the query "html-template !\"edi weitz\"" works on my test corpus, but when I tried the query "html-template !edi !test", it told me:

"Invalid initialization argument: SCORER in call for class #<STANDARD-CLASS DISJUNCTION-SUM-SCORER>.".

Obviously there is no slot named scorer in the disjunction-sum-scorer class. It may be a typo: scorer should be replaced by sub-scorers. I changed it at line 199 of boolean-scorer.lisp; the query now works and all unit tests pass too.

Friday, May 15, 2009

ASDF-INSTALL and Cliki

This article should have been posted before The annoyance of ASDF-INSTALL, but, again because of when the blog started, it comes late.

ASDF-INSTALL and Cliki work together as Common Lisp's answer to CPAN. I have used them to get Common Lisp libraries for a long time, but what I want to talk about here is how to create your own package and announce it on Cliki.

  1. Make a gzipped tarball of the ASDF system, arranged to unpack into a subdirectory systemname_version.

    $ tar -cvzf systemname_version.tar.gz systemname_version/

  2. Cliki requires that all packages be accompanied by detached PGP signatures.

    $ gpg -b -a mail-reader_0.1.tar.gz

    Note: I have been running Linux in a virtual machine for a long time, and gpg complained that it could not get enough random bytes when I wanted to generate a new key pair.

    $ gpg --gen-key

    Finally, I got enough random bytes by running this command in PuTTY while running

    $ cat /etc/random

    in eshell inside Emacs in another PuTTY session. I don't know why, but it just works.

  3. Upload the package and signature file to a web server.
  4. Create a page on Cliki with the same name as the package, whose content should contain

    :(package "http://www.example.com/lisp/mail-reader_0.1.tar.gz")

    Note: you should not leave "A N other" as your name when editing. It tripped me up until I realized it. But why is it the default?
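
Step 1 can be checked locally before uploading anything. The sketch below builds a throwaway tarball for mail-reader_0.1 in a temporary directory and lists the top-level names inside it. The .asd content is only a placeholder, not a real system definition.

```shell
#!/bin/sh
# Build a tarball for mail-reader_0.1 in a scratch directory and check
# that every entry unpacks under mail-reader_0.1/.
name=mail-reader_0.1
work=$(mktemp -d)
mkdir "$work/$name"
echo '(defsystem :mail-reader)' > "$work/$name/mail-reader.asd"   # placeholder content

# Create the archive from the parent directory so paths start with $name/.
( cd "$work" && tar -czf "$name.tar.gz" "$name" )

# Cut each listed path at its first "/" and collapse duplicates.
top_dirs=$(tar -tzf "$work/$name.tar.gz" | sed 's,/.*,,' | sort -u)
echo "$top_dirs"
```

If more than one name is printed, the archive does not unpack into a single subdirectory.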

Thursday, April 30, 2009

Some experiences with Montezuma, cl-markdown and cl-html-diff

I used these three libraries recently and want to write something down, :).

Montezuma

Montezuma is a full text index/search engine. What interests me is that its ports form a circle: Common Lisp -> Ruby -> Java -> Common Lisp. Why can't I just use the original version?

Montezuma offers a search-each method for searching, but search-each only calls a callback function for each search result, so I have to mutate variables to collect results, which violates functional programming style. Another inconvenience is that there is no method to clear the whole index, only a delete-document method. Besides, it does not export a method for getting the number of documents in the index, so I must use ::.

(montezuma::num-docs (montezuma::reader *wiki-index*))

cl-markdown

I use emacs-muse as the markup language for this post. However, elisp is not popular and Perl is; Markdown is a similar markup language whose tool is written in Perl. I found cl-markdown, a Common Lisp port.

cl-html-diff

It is great that Common Lisp always seems to have the library I am looking for. cl-html-diff generates a human-readable diff of HTML documents. Human-readable here means that it marks changes with two elements, del and ins. I do not know whether there are default styles for them, so I added two rules to my CSS file.

ins {
    color: blue;
    text-decoration: underline;
}
del {
    color: red;
    text-decoration: overline;
}

Monday, April 27, 2009

Learn the use of Mercurial (II)

This should be part 0 of the series, but it happened before I opened this blog, so it ends up delayed as part II, ;).

As I said, I have a flaky internet connection. My ISP claims it is very fast, but I believe the GFW uses up a lot of bandwidth. I do not want to touch politics, but please let technology pass, OK?

People who live outside my country do not have this problem, so they design software without considering it, and that brings pain for me. I wanted to clone a repository that contains lots of changesets. I powered on my machine in the morning and let it start cloning; when I came back at night, I found it had stopped working very early, and my machine had just made me pay the electricity bill for the rest of the time!

I sought help from Google, but found only an awkward method: clone up to the nth revision, then incrementally pull m revisions at a time until reaching the tip revision. I wrote a script to get rid of the type-and-wait loop.

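#!/bin/sh
# Assumes revisions up to 280 are already cloned; pulls 25 more per step, 11 steps.
myvar=0
while [ $myvar -ne 11 ]
do
     myvar=$(( $myvar + 1 ))
     # Next target revision: 305, 330, ..., 555
     echo $((280 + $myvar * 25))
     hg pull -r $((280 + $myvar * 25))
done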
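
The same idea can be written with the numbers as parameters. This is a sketch only: pull_steps is a hypothetical helper name, and it just prints the target revisions, so it can be tried without a repository. The hg call stays in a comment.

```shell
#!/bin/sh
# pull_steps BASE STEP COUNT: print the revision to pull at each step,
# starting STEP revisions above BASE.
pull_steps() {
    base=$1 step=$2 count=$3
    i=1
    while [ "$i" -le "$count" ]; do
        echo $((base + i * step))
        i=$((i + 1))
    done
}

# Dry run: show the first three targets after revision 280.
pull_steps 280 25 3

# With Mercurial installed, inside the partial clone:
#   for rev in $(pull_steps 280 25 11); do hg pull -r "$rev" || break; done
```

Stopping at the first failed pull means the loop can be restarted from the last revision that arrived, instead of pushing on through a dead connection.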

Wednesday, April 15, 2009

Learn the use of Mercurial (I)

I have used VSS, CVS and SVN as source control management systems. Most of the time I only use a few simple commands such as checkout, update, commit, diff and history. Now they are classified as centralized source control management systems and marked as old-fashioned [1], while many distributed source control management systems have emerged, such as git, Mercurial, darcs and so on. Now I am getting into Mercurial.

Mercurial uses hg as its command name: yes, Mercurial is Hg! I have not read chemical symbols for quite a long time, :)

hg clone is similar to svn checkout: both are the starting point. It is easy to get familiar with commit, diff and history.

So can I do something different with this fashionable product? Yes: transfer changesets between repositories. It does not need a central repository for syncing code between two development sites, which is very useful when one site has no internet connection.

hg export and hg import are a pair of operations. hg export takes a revision number and exports the changeset corresponding to that revision. hg import then applies this changeset to another repository. hg bundle and hg unbundle handle a group of changesets, while the previous pair only deals with one changeset at a time.

When I work in a centralized repository, I am very careful with commits so that I do not break the whole repository. But now I am in my own repository and can commit whenever I like; nobody will blame me. When I am ready to push my work, though, I may need to destroy some commit history, to conceal my stupid mistakes, to avoid cluttering the other repository's log, or just to save network bandwidth. Google told me to use hg strip, but I did not see it in hg help. Then I tried hg rollback; unfortunately it cannot roll back twice. Finally, I found that strip is provided by the MqExtension. Add

[extensions]

hgext.mq =

to .hgrc, and the strip command appears.


1. I believe old does not mean bad, :)

Thursday, April 9, 2009

Study Selector widget (II)

Inspired by the widget hierarchy, I found out why my get-widget-for-tokens did not get the URI tokens: the on-demand-selector widget is first mapped to "main" by the navigation widget, so I should visit http://127.0.0.1:8080/main/asdf, not http://127.0.0.1:8080/asdf [1]. Besides, get-widget-for-tokens must return not only the widget but also the consumed tokens; otherwise there is a page-not-found error.

Let me lay out the whole update protocol for the selector widget:

  • handle-normal-request gets the URI tokens from the browser and calls update-widget-tree.
  • update-widget-tree calls update-children, which is specialized on the selector widget, so get-widget-for-tokens is called with the URI tokens.
  • If update-children does not signal an http-not-found error, the current widget is rendered.
  • Make sure all tokens were consumed, or report a page-not-found error.
  • on-demand-selector, specifically, sets the widget returned by get-widget-for-tokens as its child and caches it, so it best fits dynamic wiki-style content creation.

1. asdf is an arbitrary URI token, used just for testing.

I found the reason why the dataform widget was not updating

It is not your fault. I'm sorry for modifying you to conceal my mistake. I redefined the with-widget-header method of the widget that contains you, but I did not do it the recommended way, :(.

Here is the with-widget-header documentation:

"Renders a header and footer for the widget and calls 'body-fn' within it. Specialize this function to provide customized headers for different widgets.

'widget-prefix-fn' and 'widget-suffix-fn' allow specifying functions that will be applied before and after the body is rendered."

What I needed was a customized footer for the widget. I added my code to the body of the method arbitrarily, so you did not get updated when your state changed. I should have used the widget-suffix-fn parameter!

No hasty coding at all!!!!

Tuesday, April 7, 2009

Study Selector widget (I)

on-demand-selector says it implements dynamic wiki-style content creation. Oh, that is great! It is what I want: a simple wiki. But how does it solve the second problem in The render mechanism in Weblocks?

I studied on-demand-selector's source file first, but it is very short and I learned nothing. I switched to its parent widget, selector, which is about a mechanism for mapping URI tokens to widgets. I know what a URI is and what a token is, but what is a URI token? A URI looks like A/B/C/D, so I think A, B, C and D are all tokens and are passed in order.

At the beginning, I thought on-demand-selector could replace my wiki-widget, but one widget needs one URI token. How can I provide those URI tokens? After some testing, I found there is only a default URI token, nil, if I do nothing about it. I discovered that lookup-function returns the remaining tokens. Oh, so given a default URI token, it could return the first page-widget and set the remaining tokens so that the next cycle returns the other page-widgets; that seemed OK. Unfortunately, the mechanism is not what I believed. I do not know what the purpose of the remaining tokens is.

I am sorting out my thoughts here to make better progress, :).

Sunday, March 29, 2009

Meet Mr Clbuild

I left Mr ASDF-INSTALL after working with him for a long time and came to meet Mr Clbuild.

I followed the instructions on its website and downloaded it with darcs:

$ darcs get http://common-lisp.net/project/clbuild/clbuild

Make the shell script executable:

$ cd clbuild
clbuild$ chmod +x clbuild

Run check to make sure all its helper friends are here:

clbuild$ ./clbuild check

It reported that I do not have 'which'. What is which? I logged in to the mystic server, where another Mr Clbuild lives, and looked at which: it prints the path of its parameter. Oh! whereis does this job too, so I made a link from whereis to imitate which. Unfortunately, whereis prints the paths of all the binary, library and manual files, and it even prints its parameter first, so whereis -b is wrong too.

I had to install the which command. Luckily, in Arch Linux the which command is provided by a package that is also named which.
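
For scripts of my own, POSIX shells offer a builtin way to ask the same question without the external which program: command -v. The sketch below is not what clbuild does, and have is a hypothetical helper name; it only shows the check itself. The last tool name is made up so that the "missing" branch is exercised.

```shell
#!/bin/sh
# have NAME: succeed if NAME resolves to a command in this shell.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report each helper; no-such-helper-xyz is a deliberately bogus name.
for tool in sh ls no-such-helper-xyz; do
    if have "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

Unlike whereis, command -v reports only how the shell itself would run the name, which is the answer a build script usually wants.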

Go on!

clbuild$ ./clbuild install cl-ppcre

No problem! Then:

clbuild$ ./clbuild slime-configuration

I love Emacs and have lived in it for a long time, so I cannot give up slime. I do not want to mess up my .emacs, so I use slime-configuration instead of slime. It prints a .emacs excerpt. I am very careful and checked it before copying and pasting it into my .emacs: it tries to load slime, which I had not installed with clbuild.

clbuild$ ./clbuild install slime

So, after exchanging greetings with Mr Clbuild, what was my purpose in meeting him? To let him invite Mr Weblocks to be my friend, :).

Mr Weblocks is in the list printed by

clbuild$ ./clbuild projects
clbuild$ ./clbuild install weblocks

Mr Weblocks has lots of friends, and my slow and GFWed [1] ISP interrupted the party. Mr Clbuild told me to run

clbuild$ ./clbuild update --resume

I regarded it as an ordinary problem, but it was not! Why am I always out of luck, :(

(require :weblocks)
....
Symbol "DEFINE-FOREIGN-LIBRARY" not found in the CFFI package.

I grepped for "define-foreign-library" in the cffi source directory and found nothing. I checked its version: it is 0.1.0, but my ASDF-INSTALLed cffi is 0.10.3. Oh no! After struggling for some time, switching between Mr ASDF-INSTALL and Mr Clbuild (it seems clbuild is not compatible with ASDF-INSTALL), I finally resolved it by reinstalling cffi with clbuild.

This was my first meeting with Mr Clbuild. I hope he will be a good friend of mine, :)


1. Great Firewall

Wednesday, March 25, 2009

I can not stop weblocks

I found this problem very early, and I judge that it is because hunchentoot [1] took a big step, from 0.15.8 to 1.0.1 [2]: it uses a new class, acceptor, which I do not understand very well. So I skipped this problem, because I could simply say sayonara to slime instead. That just wastes my time, though: it takes time to start slime again.

However, Murphy's law struck: I found my data got mixed up when I modified it across exiting and re-entering slime. I first suspected the modification algorithm. Sigh, that was a waste of time again: it was simply because close-store was not called. I believe this is done in stop-weblocks, which I had been ignoring. So I had to debug stop-weblocks, and discovered that it enters an endless loop when shutting down a one-thread-per-connection-taskmaster object. I do not know why, but I redefined the shutdown method so that it does not wait for the thread to die naturally but kills it with sb-thread:terminate-thread.

It works, and obviously I need to check this problem again once hunchentoot and Weblocks development have stabilized. [3]


1. Weblocks is based on hunchentoot.

2. It is a pity that hunchentoot 1.0 does not support *catch-error-p*; it is said to be absent only for a little while.

3. Maybe my configuration is wrong, but that does not matter for now.

Monday, March 23, 2009

The render mechanism in Weblocks

Now I'm working on a simple wiki application for Weblocks. I created a wiki widget that shows pages in a single place. (Maybe pages should link to each other, but it is a simple wiki for now, :)). The user can click a page, modify its content and submit it.

I used a widget that inherits from dataform to represent a page, because dataform has two states: one shows the content, and the other is a form in which you can modify the content and submit it. I tried it and met a problem: when the user clicks the "Modify" link of the dataform widget, its state changes but its appearance does not. I think this is because the wiki widget is not re-rendered. The page widget is stored in a slot of the wiki widget, so the wiki widget does not know whether it needs to be rendered. So what if I put the page widget into the children of the wiki widget?

I kept trying, but it did not work :(. The only advantage I gained is that the parent widget can render its child widgets automatically.

I know I must mark-dirty a widget to get it rendered. It is a pity that there is no hook function on the modify link of the dataform widget. I had to modify the render function of the page widget in order to replace the modify link with a customized link.

This is my first impression of the render mechanism in Weblocks, :). Oh, another thing: it seems a child widget's render-widget does not call its parent's method automatically; I need to use call-next-method manually.

The annoyance of ASDF-INSTALL

When I entered the LISP world, someone told me ASDF is like the Makefile world, but written in LISP. Yes, the philosophy of the LISP world is to do everything in LISP, because it can do everything. It is great!

Then it goes further: ASDF-INSTALL is used to manage LISP packages, so we do not need the APT or RPM package systems provided by the OS at all; we just do it all in our own world. However, this is not so great!

ASDF-INSTALL uses PGP signatures to verify the provenance of the downloaded code. That seems OK, but there are so many packages, and every package developer has his or her own PGP key. I need to add them to the package supplier list one by one and trust them one by one. Unfortunately, ASDF-INSTALL does not provide a restart that establishes a PGP trust relationship. Maybe we should use a signature mechanism written in LISP.

We have a central site, Cliki, from which we get all ASDF packages, and we trust it. So why does it not provide a bridge of trust between us and the developers? More trust, better world!

It is good that ASDF-INSTALL deals with package dependencies, but it does not deal with package versions. You need to check them manually: dig them out of lots of BIG sentences in the backtrace (I am not used to seeing every character capitalized; I'm not a native English speaker), and skip all the useless restarts (I tried them, but the error recurs next time, and besides they may cause another problem, who knows!). The worst situation is when you only find out from your own code.

I wrote this post not because I hate it, but to record my experience, because it cost me lots of time to struggle through: I met a problem, and in order to resolve it I needed to resolve other problems, like GPG warnings and GnuPG installation, package version dependencies and BIG backtraces. This experience may be upsetting, but it is useful for my post, :).

Sunday, March 22, 2009

Why I open this BLOG

To share my experience in the Lisp world and help other people? No! I need help first! Of course, if my posts can help others, I will be very, very happy!

After I came into the LISP world, I met lots of problems. My first reaction is: should I ask and get a quick answer? It seems simple; however, whether and how to ask a question is itself a problem. If you know little about the problem, you don't know how to ask about it precisely. Moreover, I'm not a native English speaker, and sometimes I cannot find the proper word or sentence to express my meaning. Even worse, when I keep asking frequently, other people will be annoyed and so will I, and I will be lost in uncertainty.

But if I do not get help from experts, I must waste time on many trivial problems. So I think the BLOG may be good for me: I try my best to describe my problem or experience in detail, and if you feel it is a good point for discussion, please comment on it; if not, just skip it. I hope it is a convenient way to communicate.

Beyond that, the BLOG should do me some good: when I cannot solve a problem, I feel sad and do not want to touch it for a period of time. Now I can try to describe it clearly, which is also a success that can give me a boost.

Finally, special thanks to Leslie P. Polzer. He gave me a lot of help when I was struggling in the LISP world. I'm very sorry that I was not very energetic before; I should do better.

PS: If the grammar or meaning of my posts is wrong, I will appreciate your taking the time to point it out. It is a good way for me to practise my English, :).