Re: Web archiving and WARC

From: raffaele messuti <raffaele.messuti_at_nyob>
Date: Tue, 29 Nov 2011 09:16:35 +0100
To: CODE4LIB_at_LISTSERV.ND.EDU

On Thu, Nov 24, 2011 at 12:30 AM, Edward M. Corrado
<ecorrado_at_ecorrado.us> wrote:
> I did find a version of wget with WARC support built in [1] from the
> Archive Team, so that may be my solution, but compiling software with
> "dirty" written into the name of the zip file is maybe not the best
> long-term solution. Does anyone know of any other simple tools to
> create a WARC file (either from harvesting or converting a wget or
> similar mirror/archive)?

For me, wget-warc [1] is a safe place to start.
The patches made by Archive Team have been pushed into the wget sources [2],
so I think they will be available in a stable release before long.

You can use this script to compile it:
https://github.com/ArchiveTeam/splinder-grab/blob/master/get-wget-warc.sh
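
Once a WARC-enabled wget is built, a crawl could be driven roughly like this. It is only a sketch: it assumes the build accepts the `--warc-file` option as found in later GNU wget releases, and the function names, the example URL, and the output name are made up for illustration.

```python
import shutil
import subprocess


def build_wget_warc_command(url, warc_name, wget_binary="wget"):
    """Return the argument list for a wget crawl that also writes a WARC.

    Assumption: the wget build supports --warc-file (WARC-patched wget or a
    later GNU wget release). The helper name is hypothetical.
    """
    return [
        wget_binary,
        "--mirror",                  # recursive download with timestamping
        "--page-requisites",         # also fetch images, CSS, and scripts
        "--warc-file=" + warc_name,  # base name of the WARC output file
        url,
    ]


def crawl_to_warc(url, warc_name, wget_binary="wget"):
    """Run the crawl and return wget's exit code.

    A non-zero code is returned rather than raised, because wget reports
    an error whenever any single resource fails during a mirror.
    """
    if shutil.which(wget_binary) is None:
        raise FileNotFoundError("wget binary not found: " + wget_binary)
    command = build_wget_warc_command(url, warc_name, wget_binary)
    return subprocess.run(command).returncode


# Example (hypothetical URL and output name):
#   crawl_to_warc("http://example.org/", "example-org")
```

Pass the path to the freshly compiled binary as `wget_binary` if it is not installed on the PATH.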

ciao.

[1] https://github.com/alard/wget-warc
[2] http://bzr.savannah.gnu.org/lh/wget/trunk/revision/2571


--
raffaele messuti <raffaele.messuti_at_gmail.com>
@atomotic
Received on Tue Nov 29 2011 - 03:18:34 EST