Now for once, this is a really neat use of technology: someone using their brains and a suitable tool to solve a very apparent problem.
Perl may be frowned upon by some as old or outdated, but seriously, for parsing data, pattern matching and trawling it’s still excellent, and you can get a program up and running very fast, especially with the modules available on CPAN.
The computer crimes unit of New York’s Suffolk County Police Department sits in a gloomy government office canopied by water-stained ceiling tiles and stuffed with battered Dell desktops. A mix of file folders, notes, mug shots and printouts form a loose topsoil on the desks, which jostle shoulder-to-shoulder for space on the scuffed and dented floor.
I’ve been invited here to witness the endgame of a police investigation that grew from 1,000 lines of computer code I wrote and executed some five months earlier. The automated script searched MySpace’s 100 million-plus profiles for registered sex offenders — and soon found one that was back on the prowl for seriously underage boys.
Of course, some manual monkey work was still needed to verify the profiles, but getting from 100 million down to a handful is pretty neat, eh?
The code swept in a vast number of false or unverifiable matches. Working part time for several months, I sifted the data and manually compared photographs, ages and other data, until enhanced privacy features MySpace launched in June began frustrating the analysis.
Excluding a handful of obvious fakes, I confirmed 744 sex offenders with MySpace profiles, after an examination of about a third of the data. Of those, 497 are registered for sex crimes against children. In this group, six of them are listed as repeat offenders, though Lubrano’s previous convictions were not in the registry, so this number may be low. At least 243 of the 497 have convictions in 2000 or later.
I for one am impressed.