steamsprocket.org.uk

Stop amavisd-new wrapping long lines in syslog

Update: Newer versions of amavisd-new (certainly 2.6.4-3) have a variable called $logline_maxlen which does exactly what it says on the tin. The rest of this post is therefore of only historic interest.

I administer a mail server that uses the popular amavisd-new to perform virus scanning, and spam filtering using SpamAssassin. That server uses Logcheck to send mail notifications if any unusual messages are logged – it works by filtering out any log messages fitting a set of patterns considered unimportant, and reporting any that are left. One small annoyance with amavis is that it splits lines that are longer than a certain magic number, so you get syslog entries like this:

Feb 23 15:15:23 hostname amavis[3139]: (03139-11) Imagine this message is long en....
Feb 23 15:15:23 hostname amavis[3139]: (03139-11) ...ough to be split

The split means that neither line fits the pattern of a message which can be ignored. Since you can’t predict how long the message will be, it’s not possible to write an ignore rule to take this splitting into account.
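To make the failure concrete, here is a minimal Python sketch. The ignore pattern and the log text are invented for illustration (Logcheck uses egrep-style rules in its own rule files, not Python). The point is only that a rule anchored on the whole message matches the unsplit line but neither fragment.

```python
import re

# Hypothetical ignore rule, anchored at both ends the way Logcheck rules usually are.
IGNORE = re.compile(
    r"^\w{3} [ :0-9]{11} \S+ amavis\[\d+\]: \(\d+-\d+\) "
    r"Passed CLEAN, .* Hits: [-0-9.]+$"
)

PREFIX = "Feb 23 15:15:23 hostname amavis[3139]: (03139-11) "

# The message as it would appear if it fitted on one line (made-up content).
whole = PREFIX + "Passed CLEAN, <a@example.com> -> <b@example.com>, Hits: -1.2"

# The same message after splitting; the split point here is arbitrary.
first = PREFIX + "Passed CLEAN, <a@example.com> -> <b@exa..."
second = PREFIX + "...mple.com>, Hits: -1.2"


def ignored(line):
    """Return True if the hypothetical rule would filter this line out."""
    return IGNORE.match(line) is not None
```

Only `whole` is filtered; both `first` and `second` would be reported, and since the split point moves with the message length, no fixed pattern can cover the fragments.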

Extensive Googling revealed no mention of this problem anywhere, and eventually I decided I would have to Use The Source. The business part of amavis is the script /usr/sbin/amavisd-new – at least that’s where the Debian package puts it – which includes the following section [1]:

    my($pre) = $alert_mark;
    my($logline_size) = 980;  # less than  (1023 - prefix)
    while (length($am_id)+length($pre)+length($errmsg) > $logline_size) {
      my($avail) = $logline_size - length($am_id . $pre . "...");
      $log_lines++; $! = 0;
      syslog($prio, "%s", $am_id . $pre . substr($errmsg,0,$avail) . "...");
      if ($! != 0) { $log_warnings++; $log_status_counts{"$!"}++ }
      $pre = $alert_mark . "...";  $errmsg = substr($errmsg, $avail);
    }
    $log_lines++; $! = 0; syslog($prio, "%s", $am_id . $pre . $errmsg);

As we can see, there’s a hardcoded constant which determines how long a line may be. Removing or commenting out the undesired section leaves us with

    my($pre) = $alert_mark;
    $log_lines++; $! = 0; syslog($prio, "%s", $am_id . $pre . $errmsg);

Another option might be to increase the value of $logline_size to something that means you rarely if ever exceed it.
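To see what the loop does and how the limit affects it, here is a rough Python transliteration of the quoted Perl. It is a sketch, not amavis code: the syslog call is replaced by appending to a list, the counters are dropped, and the guard against a too-small limit is an addition of mine so the loop cannot spin forever.

```python
def split_log_line(am_id, errmsg, logline_size=980, alert_mark=""):
    """Mimic the quoted loop and return the lines that would go to syslog."""
    # Guard (not in the original): each chunk must have room for some text.
    if logline_size <= len(am_id) + len(alert_mark) + len("...") * 2:
        raise ValueError("logline_size too small for the prefix")

    lines = []
    pre = alert_mark
    while len(am_id) + len(pre) + len(errmsg) > logline_size:
        # Room left for message text once the prefix and trailing dots are counted.
        avail = logline_size - len(am_id + pre + "...")
        lines.append(am_id + pre + errmsg[:avail] + "...")
        # Continuation lines start with dots, then the rest of the message.
        pre = alert_mark + "..."
        errmsg = errmsg[avail:]
    lines.append(am_id + pre + errmsg)
    return lines
```

With the default limit a 2000-character message comes out as three lines, while `split_log_line(am_id, msg, logline_size=4096)` returns it as one, which is the effect of raising $logline_size rather than deleting the loop.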

What could possibly go wrong?

Generally it’s a very bad idea to start messing with files that are managed by the distribution’s packaging system – for one thing, your changes will be overwritten the next time the package is updated. Additionally, there must presumably have been a reason for the line splitting in the first place, and hence a reason not to disable it. This is a dirty hack, and I don’t recommend doing it. I’m only even posting it so that in the future there might be some record that the issue exists.

That said, so far it’s working as desired and there have been no side effects.

  1. As of version 1:2.6.1.dfsg-1

POSIX file semantics in Windows

Some Background:

BTW: In the context of this post, ‘Windows’ always means ‘Windows NT’.

The Windows kernel is designed to support multiple independent subsystems – different application environments – atop its native API. The two most common examples of Windows subsystems are the Windows API – formerly known as Win32 and used by the overwhelming majority of Windows programs – and the POSIX subsystem [1] provided by SFU/SUA/Interix [2], henceforth known as Interix.

Note: Just to muddy the waters, Cygwin provides a POSIX environment layered on top of the Windows API.
This has nothing to do with the POSIX subsystem.

The Problem

The Windows API is – by default – case preserving, but not case sensitive, everywhere. On the other hand, NTFS (the standard Windows filesystem) is POSIX compliant – presumably because the POSIX subsystem would have been unable to claim compliance otherwise. This means among other things that it supports case sensitivity – with the result that it’s possible to create two files which differ only in case, mightily confusing most Windows software. Even worse, POSIX allows file names to contain characters that the Windows API does not. Now you don’t need two similarly named files to create confusion; just having a single file named, say, ‘How Soon is Now?.flac’ will cause a problem.
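Both kinds of troublesome name can be spotted from the non-Windows side. The following Python sketch is illustrative only: the function names are made up, and `str.casefold()` is just an approximation of NTFS’s own case-folding table.

```python
from collections import defaultdict

# Characters the Windows API rejects in a file name: the usual
# punctuation set plus the ASCII control characters 0-31.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*') | {chr(c) for c in range(32)}


def forbidden_chars(name):
    """Return the sorted characters in `name` that the Windows API rejects."""
    return sorted(set(name) & WINDOWS_FORBIDDEN)


def case_collisions(names):
    """Group names from one directory that differ only in case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

For example, `forbidden_chars('How Soon is Now?.flac')` returns `['?']`, and `case_collisions(['foo', 'FOO', 'bar'])` returns `[['foo', 'FOO']]`.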

So What?

Normally this wouldn’t be an issue. Coming from Unix it seems pretty annoying, but livable. A problem arises however when a filesystem is shared between Windows and an OS which will happily create files using the full range of names allowed by POSIX (in practice this means most operating systems other than Windows).

Say some unsuspecting user (cough) is using NTFS-3G to access NTFS filesystems from Linux; by default it doesn’t restrict the filenames allowed to the Windows-safe subset.

Well, I have [3] a lot of music filed away such that the file is named after the title of the track, and a handful of tracks include characters which are verboten under Windows – most commonly a colon or a question mark. It took quite a while to figure out why a lot of my music seemed to be mysteriously inaccessible from Windows – and wouldn’t it be nice if there were some way to get to them, even if just to rename them? Well apparently “all the files are accessible if SFU is installed on Windows”. There’s only one problem with that: Interix/SFU isn’t available for the version of Windows I’m using. Sigh.
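Since the files remain perfectly accessible from Linux, one way out is to rename them there. The sketch below is an assumption about how that could be done, not a record of what was actually run: the underscore replacement is arbitrary, the function names are hypothetical, only regular files (not directories) are handled, and it defaults to a dry run.

```python
import os
import re

# Characters the Windows API refuses in a file name.
UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def windows_safe(name, replacement="_"):
    """Replace every Windows-unsafe character in a single file name."""
    return UNSAFE.sub(replacement, name)


def plan_renames(root):
    """Yield (old_path, new_path) for each file under root that needs renaming."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            safe = windows_safe(name)
            if safe != name:
                yield os.path.join(dirpath, name), os.path.join(dirpath, safe)


def apply_renames(root, dry_run=True):
    """Rename files (or just report them if dry_run); never overwrite anything."""
    done = []
    for old, new in list(plan_renames(root)):
        if os.path.exists(new):
            continue  # skip rather than clobber an existing file
        if not dry_run:
            os.rename(old, new)
        done.append((old, new))
    return done
```

Running `apply_renames('/path/to/music')` first and reading the returned list before passing `dry_run=False` keeps surprises to a minimum.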

What Next?

Clearly Windows itself is capable of dealing with the full range of filenames, otherwise Interix wouldn’t be able to use them. Is there any way for the Windows API to access this capability? Investigations reveal the existence of a promising-sounding flag to CreateFile: FILE_FLAG_POSIX_SEMANTICS. This flag’s description doesn’t agree with its name – the description only talks about case sensitivity. Is that an oversight in the description, or a poor choice of name? Further investigations require a small detour…

At some point Microsoft decided that the ability to have files whose names differ only in case could be dangerously confusing for some of the extant Windows software (think antiviruses trying to scan MALWARE.dll vs. malware.dll), so case sensitivity is disabled by default across the board. It can be enabled for subsystems other than the Windows API, and for the Windows API using FILE_FLAG_POSIX_SEMANTICS, by setting a registry key:

Under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\kernel, there needs to be a DWORD called ObCaseInsensitive, with the value ‘0’.
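For anyone who would rather script that than click through regedit, here is a hedged sketch using Python’s standard winreg module. The function names are made up, it has to run as an administrator on Windows, and it assumes the kernel key already exists; treat it as an illustration of the setting above rather than a tested tool.

```python
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\Session Manager\kernel"
VALUE_NAME = "ObCaseInsensitive"


def read_ob_case_insensitive():
    """Return the current DWORD value, or None if the value is not set."""
    import winreg  # Windows-only standard library module

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
        except FileNotFoundError:
            return None
    return value


def enable_case_sensitivity():
    """Set ObCaseInsensitive to 0; a reboot is needed for it to take effect."""
    import winreg

    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, 0)
```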

…having set that key and rebooted, the FILE_FLAG_POSIX_SEMANTICS flag should now have an effect. A quick Python test confirms this:

import win32file  # part of the pywin32 package
f1=win32file.CreateFile('foo', win32file.GENERIC_WRITE, 0, None, win32file.CREATE_NEW, win32file.FILE_FLAG_POSIX_SEMANTICS, None)
f2=win32file.CreateFile('FOO', win32file.GENERIC_WRITE, 0, None, win32file.CREATE_NEW, win32file.FILE_FLAG_POSIX_SEMANTICS, None)

Both files are created successfully, and can be seen in Windows Explorer, though it can’t differentiate between them.

So, now we get to find out whether that flag really does allow POSIX semantics as its name suggests, or if the description is correct and all this was for nothing:

f3=win32file.CreateFile('foo?', win32file.GENERIC_WRITE, 0, None, win32file.CREATE_NEW, win32file.FILE_FLAG_POSIX_SEMANTICS, None)

Result: error: (123, 'CreateFile', 'The filename, directory name, or volume label syntax is incorrect.')

Damn.

Update:

Cygwin 1.7 claims to support case sensitivity with the appropriate registry key set, and also the full range of characters in filenames. Since Cygwin does use the Native API for some of its operations, I dared to hope that perhaps they’d performed some black magic to get this to work properly. Sadly, attempting to read an existing file with a question mark in the name leads to the response ‘cannot access <filename>: No such file or directory’, so I guess they’re just faking it.

  1. Supposedly created because POSIX compliance was a checklist requirement for US government software purchases
  2. Originally Windows came with a basic POSIX subsystem out of the box, but that died with the release of Windows XP, replaced by Interix
  3. Or rather ‘had’ – it’s all been renamed for Windows-safety now