[ arsa xx @ 07.01.2004. 23:34 ] @
I found a good PHP spider at http://phpcrawl.cuab.de/. However, it has a few bugs. I have fixed some of them, but not all. If anyone knows a better spider or can help, here is the situation.

Below is an example where the bug shows up. First, note that links are found with:

Code:
preg_match_all("/((?i)href)[ ]{0,}=[ ]{0,}(\"|'){0,1}[^\"'><\n ]{0,}(\"|'|>|<|\n| )/",$source,$regs);

Near the bottom of the listing, nonexistent links appear, that is, links that do exist but are "cut off". One of them is http://localhost/!math/prodajn (second from the bottom).

Code:
Page requested: http://localhost/!math
Status: HTTP/1.1 301 Moved Permanently
Referer-page:
Content received: 305 bytes

Page requested: http://localhost/!math/
Status: HTTP/1.1 200 OK
Referer-page: http://localhost/!math
Content received: 33415 bytes

Page requested: http://localhost/!math/index.php
Status: HTTP/1.1 200 OK
Referer-page: http://localhost/!math/
Content received: 33415 bytes

Page requested: http://localhost/!math/prikaz.php?kat=osnovna_skola
Status: HTTP/1.1 200 OK
Referer-page: http://localhost/!math/
Content received: 39372 bytes

Page requested: http://localhost/!math/prikaz.php?kat=srednja_skola
Status: HTTP/1.1 200 OK
Referer-page: http://localhost/!math/
Content received: 44831 bytes

....... .... ..

Page requested: http://localhost/!math/prodajn
Status: HTTP/1.1 404 Not Found
Referer-page: http://localhost/!math/info.php?id=26
Content received: 285 bytes

Page requested: http://localhost/!math/prodajna_mesta.php
Status: HTTP/1.1 200 OK
Referer-page: http://localhost/!math/prodajna_mesta.
Content received: 26296 bytes

Summery:
Links followed: 67
Files received: 67
Bytes received: 1600781
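One thing that can be checked in isolation is how the pattern quoted above behaves on its own. It ends a link at the first quote, space, `<`, `>`, or newline, so any href value that contains whitespace or is wrapped across two lines comes out truncated. The sketch below is only a test harness for that one behavior: the markup in `$source` is made up, `extract_hrefs()` is a hypothetical helper and not part of phpcrawl, and the log does not show the HTML of info.php?id=26, so this may or may not be what actually cuts the link to "prodajn".

```php
<?php
// Hypothetical helper (not phpcrawl code): runs the pattern from the post
// and returns the bare URL from each match.
function extract_hrefs(string $source): array
{
    // Link-extraction pattern exactly as quoted in the post.
    $pattern = "/((?i)href)[ ]{0,}=[ ]{0,}(\"|'){0,1}[^\"'><\n ]{0,}(\"|'|>|<|\n| )/";
    preg_match_all($pattern, $source, $regs);

    $urls = [];
    foreach ($regs[0] as $match) {
        // Strip the leading href= and the optional opening quote.
        $url = preg_replace("/^href[ ]*=[ ]*[\"']?/i", "", $match);
        // Drop the single terminator character the pattern always consumes.
        $urls[] = substr($url, 0, -1);
    }
    return $urls;
}

// Made-up markup (an assumption): the second href value is wrapped
// across two lines inside the quotes.
$source = "<a href=\"prodajna_mesta.php\">ok</a>\n"
        . "<a href=\"prodajn\na_mesta.php\">wrapped</a>\n";

echo implode(", ", extract_hrefs($source)), "\n";
// prints: prodajna_mesta.php, prodajn
```

If the real page turns out to contain links like the second one, the character class `[^\"'><\n ]` is the place to look. If it does not, the truncation is probably happening somewhere else in the crawler, before the source ever reaches `preg_match_all`.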