This patch tightly integrates the MHonArc mail-to-HTML convertor (Earl Hood's description, not mine) with Mailman and its internal pipermail archiving code. The purpose of the patch is to produce a fusion of (hopefully) the best features of pipermail and MHonArc for handling Mailman mailing list archives.
Although pipermail has a number of weaknesses, it has some good features, including structuring mail archives by per-list configurable periods and supporting private and public archive access.
MHonArc is represented by its proponents as being superior to pipermail in handling attachments and multipart MIME messages, and being able to build/rebuild archives from large UNIX mboxes.
In contrast, pipermail's "database" creaks and groans with large mboxes and lists with high levels of traffic, and it collapses into a heap when it is asked to rebuild from very large mboxes. The latest pipermail handling of attachments and MIME is OK'ish but appears to be weaker than MHonArc.
With MHonArc, I could not quickly find an obvious way to get the HTML archives it is generating into optional Yearly/Monthly/Weekly/Daily periods as pipermail does. It is clearly possible (for instance, see http://www.mail-archive.com/mailman-users%40python.org/) but how ...
List archive privacy can be important for some user communities and the Mailman/pipermail solution works well.
I also wanted searchable archives and saw no reason not to continue using my Mailman/HTdig integration patch for per-list archive searching.
Finally, I thought it would be neat to make choosing whether MHonArc or pipermail should generate a list's archive pages a per-list configuration option and have Mailman's archive builder
$prefix/bin/arch work whichever choice was made for any list.
The upshot was this patch, which is a hack, but one which works. The code is not pretty but I defy even the best cosmetic surgeon to produce movie star looks when he is grafting a wart onto a boil. The code works and can fill my needs until Mailman version 3 comes along incorporating a wizzy new archiver and archive search capability ...
Alternative ways of using MHonArc and HTdig in conjunction with Mailman exist and doubtless some would argue are superior. There is no compulsion to use this patch; it is just another option you can choose from.
The implementation operates with pipermail in charge of archiving. That is, the pipermail code generates the top level archive TOC pages for each list, organises each list's archive directory structure and sorts out the archiving period stuff. But when it comes to generating the message and index pages of the HTML archives, the use of pipermail or MHonArc depends on the option set for the list via the Archiving Options page of the admin web GUI.
For lists set for MHonArc archiving, pipermail uses an instance of MHonArc instead of its own code to generate the HTML message and index pages. For such lists, pipermail maintains only a vestigial database so that the problems of large pipermail databases are avoided.
The organization of the archives on disk is pretty much the same for both pipermail'ed and MHonArc'ed lists. The top level list TOC is the same as for normal Mailman, as are the per-period sub-directory structure and per-period text (mbox) archive. The naming of message files is different for MHonArc, as is the storage of extracted attachment files. For extracted attachments pipermail uses a separate directory structure while MHonArc puts them in the same directory as the messages. Actually, this is a win for MHonArc because the URLs it generates on message pages to link to extracted attachments are relative URLs. In contrast, pipermail generates absolute URLs. This means that, for pipermail generated message pages, if a list archive is changed from private to public or vice versa, the links on message pages are wrong and can only be corrected by rebuilding the entire list archive. With MHonArc'ed archives, list privacy changes are a non-event, with Mailman's creation and deletion of the symlink to the list archives in
$prefix/archives/public/ doing all that is needed.
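To make the relative-link point concrete, here is a minimal Python sketch, not code from the patch. It assumes the usual Mailman layout in which the real archive lives under a `private` directory and a public list is just a symlink under `public`; the function names, list name and period name are hypothetical. It also assumes a POSIX system where symlinks are available.

```python
import os
import tempfile


def set_archive_public(archives_root, listname, public):
    """Create or remove the public symlink for a list's archive."""
    # Assumed layout: real files under private/, symlink under public/.
    private_dir = os.path.join(archives_root, "private", listname)
    public_link = os.path.join(archives_root, "public", listname)
    if public and not os.path.islink(public_link):
        os.symlink(private_dir, public_link)
    elif not public and os.path.islink(public_link):
        os.unlink(public_link)
    return public_link


def attachment_href(message_dir, attachment_path):
    """Relative link from the directory holding a message page to an attachment."""
    return os.path.relpath(attachment_path, message_dir)


# Demonstration in a throwaway directory (hypothetical list and period names).
root = tempfile.mkdtemp()
period_dir = os.path.join(root, "private", "demo-list", "2004-January")
os.makedirs(period_dir)
os.makedirs(os.path.join(root, "public"))

href = attachment_href(period_dir, os.path.join(period_dir, "report.pdf"))
link = set_archive_public(root, "demo-list", True)
print(href)
print(os.path.isdir(os.path.join(link, "2004-January")))
```

Because the link is relative to the message page, it resolves the same way whether the page is reached through the private path or through the public symlink, which is why no rebuild is needed after a privacy change.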
$prefix/bin/arch works as normal, regardless of whether pipermail or MHonArc is generating message and index pages. Indeed, having changed the archiver option for a list,
Because I like the date/thread/subject/author indexes produced by pipermail, MHonArc (as used by this patch) does the same thing. The layout of message and index pages generated by MHonArc is controlled using three MHonArc resource configuration files (MRCFs), which must reside in Mailman's template directory structure. The MRCFs are selected for a list in the same way that any other template file is associated with that list, that is, the template hierarchy is searched. Before pipermail invokes MHonArc to handle a message or a group of messages, it selects which MRCFs apply and passes these as parameters to MHonArc. The default MRCFs reside in
subject.mrc. The look and feel the default MRCFs produce is similar to that of the archive and index pages for regular pipermail'ed archive pages. The only thing that is different with the operation of the templates for MHonArc is that there is no variable substitution performed by Mailman; instead some useful values (such as list name and archive name) are passed to MHonArc using environment variables which can be referred to in the MRCFs to characterize the pages being generated.
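The template search and the environment-variable hand-off might look roughly like the Python sketch below. It is an illustration only: the directory layout (list, virtual host, site, then default language directory), the helper names and the environment variable names `MM_LIST_NAME` and `MM_ARCHIVE_NAME` are all assumptions made for the example, not names taken from the patch.

```python
import os
import tempfile


def find_mrcf(templates_root, filename, listname, vhost, lang="en"):
    """Return the most specific copy of an MRCF, or None if there is none."""
    # Assumed search order: list-specific, virtual host, site, default.
    candidates = [
        os.path.join(templates_root, "lists", listname, lang, filename),
        os.path.join(templates_root, vhost, lang, filename),
        os.path.join(templates_root, "site", lang, filename),
        os.path.join(templates_root, lang, filename),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def mhonarc_environment(listname, archive_name, base_env=None):
    """Copy the environment and add values the MRCFs can refer to."""
    env = dict(os.environ if base_env is None else base_env)
    env["MM_LIST_NAME"] = listname          # hypothetical variable name
    env["MM_ARCHIVE_NAME"] = archive_name   # hypothetical variable name
    return env


# Demonstration: a default MRCF plus a site-wide override.
troot = tempfile.mkdtemp()
default_rc = os.path.join(troot, "en", "subject.mrc")
site_rc = os.path.join(troot, "site", "en", "subject.mrc")
for rc_path in (default_rc, site_rc):
    os.makedirs(os.path.dirname(rc_path))
    with open(rc_path, "w") as handle:
        handle.write("<!-- placeholder resource file -->\n")

chosen = find_mrcf(troot, "subject.mrc", "demo-list", "lists.example.com")
env = mhonarc_environment("demo-list", "2004-January", base_env={})
print(chosen == site_rc)
```

The environment built this way would be handed to the MHonArc child process, so the MRCFs can pick the values up without Mailman doing any variable substitution in the files themselves.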
Whether the MRCFs are passed to MHonArc on the command line when it is invoked depends on:
MHONARC_SAVE_RESOURCES MM config variable (Default value is
True). This config variable tells pipermail to pass MHonArc either the
-nosaveresources command line option. If
True then the MRCFs are not passed as command line options (excluding the situation in ), because MHonArc will already have the template information from the MRCFs in its database.
The downside to
True is that new messages for an existing archive period will continue to use the existing templates despite changes of/to the
subject.mrc applicable to the list concerned, even though
mailmanctl restart may have been run. To get the revised templates into use, run
bin/arch --wipe for the list.
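The decision described above can be pictured with a small Python sketch that assembles an argument list. It is a sketch under stated assumptions, not the patch's code: the `-add`, `-outdir` and `-rcfile` options are given from memory of MHonArc's command line and should be checked against its documentation, the option passed in the True case is not reproduced because the text does not name it, and `force_rcfiles` is a hypothetical stand-in for the unspecified exception mentioned above.

```python
def build_mhonarc_args(mhonarc_path, outdir, rcfiles, save_resources,
                       force_rcfiles=False):
    """Assemble a MHonArc command line for adding messages to an archive."""
    args = [mhonarc_path, "-add", "-outdir", outdir]
    if not save_resources:
        # Resources are not stored in MHonArc's database, so the MRCFs
        # have to be supplied on every invocation.
        args.append("-nosaveresources")
    if force_rcfiles or not save_resources:
        for rcfile in rcfiles:
            args.extend(["-rcfile", rcfile])
    return args


# Hypothetical paths, for illustration only.
saved = build_mhonarc_args("/opt/mhonarc/bin/mhonarc", "/tmp/out",
                           ["subject.mrc"], save_resources=True)
unsaved = build_mhonarc_args("/opt/mhonarc/bin/mhonarc", "/tmp/out",
                             ["subject.mrc"], save_resources=False)
print(saved)
print(unsaved)
```

With resources saved, template changes never reach MHonArc after the first run for a period, which is the downside noted above and the reason a wipe and rebuild is needed to bring revised templates into use.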
Following installation of this patch, the availability of the features it provides depends on the installation of MHonArc (see Necessary Precursors below) and on setting the value of
MHONARC_ARCHIVER_PATH to a non-empty string.
MHonArc has to be installed on the Mailman server. I found installing it into
$prefix/mhonarc worked for me. There is a new Mailman configuration variable, added to
$prefix/Defaults.py by the patch, which tells Mailman where MHonArc is installed. This was the sum total of setup I had to do for the MHonArc installation.
MHONARC_ARCHIVER_PATH = os.path.join(PREFIX, 'mhonarc', 'bin', 'mhonarc')
The patch adds an option, with radio buttons to choose pipermail or MHonArc as archiver for a list, to the Archiving Options page of the web admin GUI. This option is only displayed after a non-blank value has been assigned to
MHONARC_ARCHIVER_PATH. The default value for which archiver to use is set by a new Mailman configuration variable added to
$prefix/Defaults.py by the patch:
# Which archiver to use by default to generate archive pages:
# 0 - pipermail
# 1 - mhonarc
DEFAULT_WHICH_ARCHIVER = 0
When a new list is created and when/if archiving is enabled for it, it will use the archiver specified by
DEFAULT_WHICH_ARCHIVER at the time of the list's creation. Lists with existing archives that pre-date adoption of this patch will continue to be pipermail archives unless their choice of archiver is changed via the web admin GUI.
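How the two configuration variables might interact can be sketched in Python as follows. This is a guess at the logic for illustration, not the patch's code: the constant and function names are hypothetical, and the fallback to pipermail when no MHonArc path is configured is an assumption.

```python
PIPERMAIL = 0
MHONARC = 1


def mhonarc_available(archiver_path):
    """The MHonArc features are on only when a non-blank path is set."""
    return bool(archiver_path and archiver_path.strip())


def initial_archiver(default_which_archiver, archiver_path):
    """Archiver recorded for a list at the time the list is created."""
    # Assumed behaviour: fall back to pipermail if MHonArc is not configured.
    if default_which_archiver == MHONARC and mhonarc_available(archiver_path):
        return MHONARC
    return PIPERMAIL


print(initial_archiver(MHONARC, "/opt/mailman/mhonarc/bin/mhonarc"))
print(initial_archiver(MHONARC, ""))
```

Because the value is fixed when the list is created, changing the site default later would affect only lists created afterwards.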
As noted above, when the archiver nominated for a given list is changed the change will not take effect until
With pipermail in charge, MHonArc only gets to see the archive a period at a time. In practice, each period of a list's archive is what MHonArc sees and it maintains a database and index pages for each of them quite separately. This means that thread and date links generated by MHonArc terminate at the boundary of the archiving period. The only index linking all of the period archives for a list is the top level TOC page for a list's archives, which is generated by pipermail in the normal way. Thus far, this characteristic has not been a problem for me and I've not given much thought to changing it.
When $prefix/bin/arch is run on a list configured to use MHonArc, and because of the period archive approach, pipermail passes messages for the same period to MHonArc for processing in temporary mbox files. The way this is done means that the memory demand made by Mailman/pipermail in handling a large mbox is no bigger than the biggest message in the input file being processed. By contrast, when archiving a single message, for the ArchRunner say, it is passed via stdin to MHonArc.
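The per-period hand-off can be illustrated with a short Python sketch that streams a large mbox into one temporary mbox per period, holding one message in memory at a time. It uses the standard library `mailbox` module rather than pipermail's own code, assumes monthly periods keyed as `YYYY-MM`, and assumes every message carries a parseable Date header; the function names are hypothetical.

```python
import email.utils
import mailbox
import os
import tempfile


def period_key(msg):
    """Monthly period key for a message, e.g. '2004-01' (assumed naming)."""
    date = email.utils.parsedate_to_datetime(msg["Date"])
    return "%04d-%02d" % (date.year, date.month)


def split_by_period(mbox_path, workdir):
    """Copy each message into a per-period mbox file under workdir."""
    outputs = {}
    source = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in source:  # messages are read one at a time
            key = period_key(msg)
            if key not in outputs:
                out_path = os.path.join(workdir, key + ".mbox")
                outputs[key] = mailbox.mbox(out_path)
            outputs[key].add(msg)
    finally:
        for box in outputs.values():
            box.close()  # close() flushes pending writes
        source.close()
    return {key: os.path.join(workdir, key + ".mbox") for key in outputs}


# Demonstration with a tiny three-message mbox.
work = tempfile.mkdtemp()
src_path = os.path.join(work, "list.mbox")
src = mailbox.mbox(src_path)
for date_header in ("Mon, 05 Jan 2004 10:00:00 +0000",
                    "Tue, 06 Jan 2004 11:00:00 +0000",
                    "Mon, 02 Feb 2004 09:00:00 +0000"):
    src.add("From: a@example.com\nDate: %s\nSubject: test\n\nbody\n" % date_header)
src.close()

period_files = split_by_period(src_path, work)
print(sorted(period_files))
```

Each resulting file would then be given to MHonArc in turn, so MHonArc only ever sees one period's messages at a time.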
My Mailman-HTdig integration patch works as normal with archives generated by both pipermail and MHonArc. Just install and use it in the same way as usual.
Mailman's pipermail features and configuration for obscuring mail addresses in the HTML archives to thwart email address harvesters are not automatically applied to MHonArc generated pages. Features provided by MHonArc can be used by making modifications to the MHonArc resource configuration files that control archive index and message page generation. You should take copies of the default MRCFs installed in
$prefix/templates/en, modify them and add them to a site, virtual host or list specific sub-directory of the template hierarchy.
The default MRCFs have embedded in them default indexing control directives for HTdig, in anticipation of HTdig being used for archive search. With pipermail generated HTML pages the effect of changing the value of Mailman config variables
$prefix/Mailman/mm_cfg.py is dynamically incorporated into the pages. With MHonArc generated pages this must be achieved by copying and modifying the default MRCFs installed in
$prefix/templates/en and adding them to a site, virtual host or list specific sub-directory of the template hierarchy.
This patch is applicable to Mailman MM 2.1.3 and later.
The following patch must be applied to Mailman before applying this patch:
DATA_FILE_VERSION value is updated in $prefix/Mailman/Version.py
$prefix is the value of the Mailman
> install.me
Checking dependencies:
    Fcntl ......................... ok
    File::Basename ................ ok
    Getopt::Long .................. ok
    Symbol ........................ ok
    Time::Local ................... ok
Pathname of perl executable: ("/usr/bin/perl")
Directory to install executables: ("/usr/bin") $prefix/mhonarc/bin
Directory to install library files: ("/usr/lib/perl5/site_perl/5.6.1") $prefix/mhonarc/lib/perl5/site_perl/5.6.1
Directory to install documentation: ("/usr/doc/MHonArc") $prefix/mhonarc/doc/MHonArc
Directory to install manpages: ("/usr/share/man") $prefix/mhonarc/share/man
You have specified the following:
    Perl path: /usr/bin/perl
    Bin directory: $prefix/mhonarc/bin
    Lib directory: $prefix/mhonarc/lib/perl5/site_perl/5.6.1
    Doc directory: $prefix/mhonarc/doc/MHonArc
    Man directory: $prefix/mhonarc/share/man
Is this correct? ['y']
...
See the Description and Implementation Details above.
Apply the patch from within the Mailman build directory using the command:
patch -p1 < path-to-patch-file
Uses the same patch as MM 2.1.7
|2.1.3-0.2||First 'official release'|
|2.1.3-0.1||Original 'unofficial release'|
Last updated: 09/07/2009 13:29