smutcraft uses a simple Perl harvesting script that reads a table of sites to scan, queries each one with LWP and nmap, and stores the results in a PostgreSQL database. The site itself is implemented with HTML::Mason, a great content management/embedded Perl/caching tool.
Scans run once per day and cover any site whose last stats are more than a week old. Sites that have not responded in more than a month are dropped. I may push these numbers out as the number of sites grows.
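For the curious, the daily decision for each site could be sketched roughly as below. This is only an illustration of the rules just described, not the actual harvesting script: the function name, the constant names, the use of epoch seconds, and the reading of "a month" as 30 days are all assumptions.

```perl
use strict;
use warnings;

# Thresholds taken from the text; names and exact values are illustrative.
use constant DAY          => 24 * 60 * 60;
use constant RESCAN_AFTER => 7 * DAY;     # stats "over a week" old
use constant DROP_AFTER   => 30 * DAY;    # "more than a month", assumed to be 30 days

# Decide what the daily run should do with one site (hypothetical helper).
# All arguments are epoch seconds: the current time, when stats were last
# gathered for the site, and when the site last responded.
sub scan_action {
    my ($now, $last_stats, $last_response) = @_;
    return 'drop' if $now - $last_response > DROP_AFTER;
    return 'scan' if $now - $last_stats    > RESCAN_AFTER;
    return 'skip';
}
```

Under these assumptions, a site whose stats are eight days old and which answered yesterday comes back as 'scan', while one that has been silent for 31 days comes back as 'drop'. Pushing the numbers out later would only mean changing the two constants.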
Suggestions that I should be using PHP/MySQL/your religion here can be sent to the bit bucket, where they'll at least enhance random number generators with more entropy.
Copyright © 2024-2003 Rodger Donaldson, except smutcraft logo, which is copyright © 2024 Alan Bauchop.