Home > software > Realtime Signal Analysis in Perl

Realtime Signal Analysis in Perl

September 24th, 2011 Leave a comment Go to comments

About a month ago, we were working on the Fluffy Ball project – a computer input device that can react to fondling and punching. Thanks to a nice idea on the brmlab mailing list, we use a microphone and process the noise coming from the ball’s scratchy stuffing and an embedded jingle. The sounds from the outside are almost entirely dampened by the stuffing and for a human, the noise of fondling and punching is easily distinguishable.

Frequency spectrum, for our purposes, is just an array indexed by frequency, storing the amplitude of each frequency (in some range). A common variation is the power spectrum that describes the power of each frequency, i.e. the amplitude squared. Frequency spectrum is obtained by splitting the input signal to fixed-size samples and performing Discrete-Time Fourier Transform.

It turns out that trivial spectrum-based rules can be used to achieve reasonably high detection accuracy for a computer too (especially when the user is allowed to “train” her input based on feedback); I had big plans to use ANN and all the nifty things I have learned in our AI classes, but it turned out to be simply an overkill. The input signal is transformed to a frequency spectrum (see box) using real discrete FFT.

So, we have the audio signal coming in from a regular mic device and need to process it further. I chose Perl for quick prototyping and I have assumed that I would find some pre-made scaffolding for this ready. But it turns out that noone really published a simple example of even just showing a real-time frequency spectrum. So, here you go! :-)

First, we need some reasonable way to continuously display the spectrum. Most GUI paradigms are event-driven, but events are usually user interaction pieces and while it would be possible to incorporate continuous data-based updates in this model, it feels quite backwards. So we use a trick:

use warnings;
use strict;
init_dsp(); init_fft();
use Tk;
our $mw = MainWindow->new;
$mw->after(1, \&ticks); # after 1ms, give control back
sub ticks {
	while (1) {

This circumvents the event-driven architecture of Tk and instead puts our main loop in control, processing any GUI events when it’s good time. For more complex programs, this is a bad idea and it will lead to poorly maintaineable code, but when writing simple tools, you should not succumb to grand frameworks and let your code overgrow you.

Okay, how to grab audio input signal in Perl? Unfortunately, there are not really any handy modules you could use thoughtlessly. Audio::DSP is a possibility, but using it is clumsy, especially in the current world of ALSA as you have to rely on the imperfect aoss wrapper. A simple alternative is to get the raw byte data through a pipeline from the ALSA arecord tool:

our ($devname, $fmt, $bitrate, $wps, $bps, $bufsize, $dsp);
	$devname = "default"; # or e.g. hw:1,0 for an additional USB soundcard input
	$fmt = 16;            # sample format (bits per sample)
	$bitrate = 16384;     # sample rate (number of samples per second)
	$wps = 8;             # FFT windows per second (rate of FFT updates)
	$bps = ($fmt * $bitrate) / 8; # bytes per second
	$bufsize = $bps / $wps;   # window buffer size in bytes
sub init_dsp {
	open ($dsp, '-|', 'arecord', '-D', $devname, '-t', 'raw',
		'-r', $bitrate, '-f', 'S'.$fmt) or die "arecord: $!";
	use IO::Handle;
sub read_dsp {
	my $w;
	read $dsp, $w, $bufsize or die "read: $!";
	return $w;

read_dsp will return one signal window per call, the window being a binary blob consisting of one two-byte word per sample. We want to magically convert this to a spectrogram.

Audio::Analyze is again the simple way to get a signal spectrum. If you are after analyzing a pure audio signal, you probably want to use it since it can easily filter the signal based on relative human perception of frequencies etc. But for us, it is inconvenient to feed it data through a pipe and we will directly use Math::FFT. It will still handle all the gory math for our case (and we care about the actual noise, not the way people would hear it).

use Math::FFT;
use List::Util qw(sum);
our @freqs;
sub init_fft {
	my $dft_size = $bitrate / $wps;
	for (my $i = 0; $i < $dft_size / 2; $i++) {
		$freqs[$i] = $i / $dft_size * $bitrate;
sub process_signal {
	my ($bytes) = @_;
	# Convert raw bytes to a list of numerical values.
	$fmt == 16 or die "unsupported $fmt bits per sample\n";
	my @samples;
	while (length($bytes) > 0) {
		my $sample = unpack('s<', substr($bytes, 0, 2, ''));
		push(@samples, $sample);
	# Perform RDFT
	my $fft = Math::FFT->new(\@samples);
	my $coeff = $fft->rdft;
	# The output are complex numbers describing the exactly phased
	# sin/cos waves. By taking an abs value of the complex numbers,
	# we just measure the amplitude of a wave for each frequency.
	my @mag;
	$mag[0] = sqrt($coeff->[0]**2);
	for (my $k = 1; $k < @$coeff / 2; $k++) {
		$mag[$k] = sqrt(($coeff->[$k * 2] ** 2)
		                + ($coeff->[$k * 2 + 1] ** 2));
	# Rescale to 0..1. Many fancy strategies are possible, this is
	# extremely silly.
	my $avgmag = sum (@mag) / @mag;
	@mag = map { $_ / $avgmag * 0.3 } @mag;
	return @mag;

Not much to add besides the inline comments. The input of the process_signal function is a raw byte stream, the output is a list of amplitudes; @freqs maps the list indices to the actual Hz frequencies. The normalization to [0,1] interval shown here (pitching the mean at 0.3) is extremely naive, again there are many possible strategies. Also, you certainly want to use a window function etc. in more serious applications.

Now, for the visualization. We have chosen Tk for our GUI (it looks ugly, but it is reasonably easy to use despite its Tcl antics). We will use its Canvas object where we can draw freely, and just plot a line for each frequency:

our $canvas;
sub render_signal {
	# Display parameters, tweak to taste:
	my $rows = 2;
	my $hspace = 20;
	my $height = 150;
	my $vspace = 20;
	my @spectrum = @_;
	my $row_freqn = @spectrum / $rows;
	unless ($canvas) {
		$canvas = $mw->Canvas(
			-width => $row_freqn + $hspace * 2,
			-height => $height * $rows + $vspace * ($rows + 1));
	for my $y (0..($rows-1)) {
		for my $x (0..($row_freqn-1)) {
			my $hb = ($height + $vspace) * ($y + 1);
			my $i = $row_freqn * $y + $x;
			# Draw line:
			my $ampl = $spectrum[$i];
			$ampl <= 1.0 or $ampl = 1.0;
			my $bar = $height * $ampl;
			$canvas->createLine($x + $hspace, $hb,
			                    $x + $hspace, $hb - $bar);
			# Draw label:
			if (!($x % ($row_freqn/4))) {
				$canvas->createLine($x + $hspace, $hb + 0,
				                    $x + $hspace, $hb + 5,
				                    -fill => 'blue');
				$canvas->createText($x + $hspace, $hb + 15,
				                    -fill => 'blue',
				                    -font => 'small',
				                    -text => $freqs[$i]);

This suffices for a naive visualization, you can easily tweak it to do thresholding and whatever else you desire. I have found that on some of my computers, the X protocol is pushed to its limits by repeatedly drawing a large amount of lines, and sometimes the spectrum will start to lag behind the signal; either show wider bars averaging together multiple frequencies, or use something other than a Canvas object – raw pixmap transfer would likely be better than such a large amount of line drawing operations.

For a serious signal analysis work, you will also want a spectrogram – a time-based plot of amplitude of various frequencies.

To get a working script skeleton, simply piece the code snippets together (fb-simple.pl). See fb.pl for the real fluffy ball script. It is much uglier, but it maintains sample averages over longer time windows (essential for more complex signal analysis), it has simple sample recording capabilities, and an example of naive threshold-based classifier.

Categories: software Tags: , , , ,
  1. No comments yet.
  1. No trackbacks yet.

− four = 4