OPAC screen-scraping garbage characters

From: Kenneth R. Irwin <kirwin_at_nyob>
Date: Thu, 29 Apr 2004 12:33:23 -0400
To: CODE4LIB_at_LISTSERV.ND.EDU

Hi folks,

I just started tinkering with Expect to talk to our OPAC, and I'm finding
that when I look at the resulting log files I get a lot of garbage
characters, e.g.:
[H[2J[1;11H*** INNOPAC
(for more see log: http://www6.wittenberg.edu/lib/logfile.txt )

I see these characters in "more", emacs, web browsers, etc. But curiously, I
don't see them in "grep" results (although I do see them if I pipe grep
results to a logfile of their own). This makes me think that the computer
knows they are funny characters in some way, which gives me some hope that
they are filterable.
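
If these turn out to be ANSI/VT100 control sequences (the sample above looks
like cursor-home, clear-screen, and cursor-position commands), a filter might
be sketched roughly as below. This is only a sketch in Python, not a tested
fix for Innopac output. It assumes each "[" in the raw log is preceded by an
ESC byte (0x1b) that does not show up in the pasted sample, and the names
CSI_RE and strip_csi are made up for illustration.

```python
import re

# Assumption: the "garbage" is ANSI/VT100 CSI sequences, i.e. an ESC byte,
# then "[", optional parameter bytes (0x30-0x3F), optional intermediate
# bytes (0x20-0x2F), and one final byte (0x40-0x7E).
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_csi(text):
    """Return text with CSI escape sequences removed (hypothetical helper)."""
    return CSI_RE.sub("", text)


# The sample from the log, with the ESC bytes assumed to be present.
sample = "\x1b[H\x1b[2J\x1b[1;11H*** INNOPAC"
cleaned = strip_csi(sample)
print(cleaned)
```

The same pattern could be applied line by line to a whole log file. Control
sequences other than the CSI kind would need additional patterns.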

Can anyone suggest a way to do this? I'm working on HP-UX (though soon
it'll be Solaris. Soon, soon, soon!)

And if anyone already has experience using Expect to talk to their OPAC,
especially Innopac, I'd love to hear about it. Looks like it could have
some complexities, and I don't really want to reinvent the wheel if the
wheel already exists.

Thanks,
Ken

Ken Irwin                                               kirwin_at_wittenberg.edu
Reference/Electronic Resources Librarian        (937) 327-7594
Thomas Library, Wittenberg University
Received on Thu Apr 29 2004 - 11:48:40 EDT