Snoopy.class.php User Manual

Snoopy - the PHP net client v1.2.4

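Snoopy is a PHP class that simulates a web browser: it can fetch web page content and submit forms.
Features of Snoopy:
1. Fetches the contents of a web page (fetch)
2. Fetches the text of a web page with the HTML tags stripped (fetchtext)
3. Fetches the links and forms of a web page (fetchlinks, fetchform)
4. Supports proxy hosts
5. Supports basic username/password authentication
6. Supports setting user_agent, referer, cookies and header content
7. Supports browser redirects, with control over the redirect depth
8. Expands links found in a page into fully qualified URLs (default)
9. Submits form data and retrieves the response
10. Supports following HTML frames
11. Supports passing cookies on redirects
Snoopy requires PHP 4 or later. Because it is a plain PHP class, it needs no extensions, which makes it a good choice when the server does not support cURL.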

Synopsis:

include "Snoopy.class.php";  
$snoopy = new Snoopy;

$snoopy->fetchtext("http://www.php.net/");  
print $snoopy->results;

$snoopy->fetchlinks("http://www.phpbuilder.com/");  
print $snoopy->results;

$submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";

$submit_vars["q"] = "amiga";  
$submit_vars["submit"] = "Search!";  
$submit_vars["searchhost"] = "Altavista";

$snoopy->submit($submit_url,$submit_vars);  
print $snoopy->results;

$snoopy->maxframes=5;  
$snoopy->fetch("http://www.ispi.net/");  
echo "<PRE>\n";  
echo htmlentities($snoopy->results[0]);  
echo htmlentities($snoopy->results[1]);  
echo htmlentities($snoopy->results[2]);  
echo "</PRE>\n";

$snoopy->fetchform("http://www.altavista.com");  
print $snoopy->results;

Class methods:

fetch($URI)  
-----------

This is the method used for fetching the contents of a web page.  
$URI is the fully qualified URL of the page to fetch.  
The results of the fetch are stored in $this->results.  
If you are fetching frames, then $this->results  
contains each frame fetched in an array.
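
Because the results may be either a single string or an array of frames, calling code may want to treat both cases uniformly. A minimal sketch, assuming a hypothetical helper named results_as_list() that is not part of Snoopy:

```php
<?php
// Hypothetical helper (not part of Snoopy): always return the fetched
// content as a list of documents, whether or not frames were followed.
function results_as_list($results)
{
    // Frames were fetched: already an array, one entry per document.
    if (is_array($results)) {
        return array_values($results);
    }
    // Single page: wrap the string so callers can loop uniformly.
    return array($results);
}

// Stand-in value; with Snoopy you would pass $snoopy->results instead.
foreach (results_as_list("<html>single page</html>") as $i => $doc) {
    echo $i, ": ", htmlspecialchars($doc), "\n";
}
```

With this in place, printing code can loop over the list rather than hard-coding indexes such as results[0], results[1] and results[2].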

fetchtext($URI)  
---------------    

This behaves exactly like fetch() except that it only returns  
the text from the page, stripping out html tags and other  
irrelevant data.        
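
To make "text only" concrete, the following is a rough approximation built from standard PHP functions. The function name html_to_text() is an assumption for illustration, and Snoopy's own stripping logic may differ in detail:

```php
<?php
// Rough approximation of text-only extraction; this is an illustration,
// not Snoopy's actual implementation.
function html_to_text($html)
{
    // Drop script and style blocks entirely, including their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo html_to_text('<p>Hello <b>world</b> &amp; friends</p><script>x()</script>'), "\n";
// prints: Hello world & friends
```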

fetchform($URI)  
---------------    

This behaves exactly like fetch() except that it only returns  
the form elements from the page, stripping out html tags and other  
irrelevant data.        

fetchlinks($URI)  
----------------

This behaves exactly like fetch() except that it only returns  
the links from the page. By default, relative links are  
converted to their fully qualified URL form.
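
The idea of converting a relative link to its fully qualified form can be sketched as follows. This is a simplified illustration under stated assumptions: expand_link() is a hypothetical name, www.example.com is a placeholder host, and cases such as "../" segments or protocol-relative links are deliberately not handled.

```php
<?php
// Simplified sketch of link expansion. It ignores "../" segments,
// protocol-relative links and fragments, and is not Snoopy's own code.
function expand_link($link, $base)
{
    // Already fully qualified (has a scheme such as http://).
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $link)) {
        return $link;
    }
    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');
    // Root-relative link: attach it directly to scheme and host.
    if (substr($link, 0, 1) === '/') {
        return $root . $link;
    }
    // Otherwise resolve against the directory of the base path.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo expand_link('about.html', 'http://www.example.com/docs/index.html'), "\n";
// prints: http://www.example.com/docs/about.html
```

Setting $expandlinks to false, as one of the later examples does, leaves the links as they appear in the page.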

submit($URI,$formvars)  
----------------------

This submits a form to the specified $URI. $formvars is an  
array of the form variables to pass.
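
For orientation, this is roughly what an array of form variables looks like once it is encoded as a URL-encoded request body. The sketch uses PHP's built-in http_build_query() (available from PHP 5) with the variables from the synopsis; how Snoopy assembles the body internally is not shown here and may differ.

```php
<?php
// The same form variables as in the synopsis.
$submit_vars = array(
    "q"          => "amiga",
    "submit"     => "Search!",
    "searchhost" => "Altavista",
);

// Encode as application/x-www-form-urlencoded; the separator is passed
// explicitly so the result does not depend on php.ini settings.
$body = http_build_query($submit_vars, '', '&');
echo $body, "\n";
// prints: q=amiga&submit=Search%21&searchhost=Altavista
```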

submittext($URI,$formvars)  
--------------------------

This behaves exactly like submit() except that it only returns  
the text from the page, stripping out html tags and other  
irrelevant data.        

submitlinks($URI,$formvars)  
---------------------------

This behaves exactly like submit() except that it only returns  
the links from the page. By default, relative links are  
converted to their fully qualified URL form.

Class variables (default value in parentheses):

$host           the host to connect to  
$port           the port to connect to  
$proxy_host     the proxy host to use, if any  
$proxy_port     the proxy port to use, if any  
$agent          the user agent to masquerade as (Snoopy v0.1)  
$referer        referer information to pass, if any  
$cookies        cookies to pass, if any  
$rawheaders     other header info to pass, if any  
$maxredirs      maximum redirects to allow. 0=none allowed. (5)  
$offsiteok      whether or not to allow redirects off-site. (true)  
$expandlinks    whether or not to expand links to fully qualified URLs (true)  
$user           authentication username, if any  
$pass           authentication password, if any  
$accept         http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)  
$error          where errors are sent, if any  
$response_code  response code returned from server  
$headers        headers returned from server  
$maxlength      max return data length  
$read_timeout   timeout on read operations (requires PHP 4 Beta 4+)  
                set to 0 to disallow timeouts  
$timed_out      true if a read operation timed out (requires PHP 4 Beta 4+)  
$maxframes      number of frames we will follow  
$status         http status of fetch  
$temp_dir       temp directory that the webserver can write to. (/tmp)  
$curl_path      system path to cURL binary, set to false if none
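
Several of these variables are outputs that are read after a request rather than settings. A minimal sketch of how they might be inspected, using a stand-in object so that it runs without Snoopy or network access; describe_outcome() is a hypothetical helper, and the sample status line assigned to response_code is an assumed value for illustration only:

```php
<?php
// Hypothetical helper: summarize the outcome of a request from the
// documented properties. $client is any object that exposes them.
function describe_outcome($client)
{
    if (!empty($client->timed_out)) {
        return "timed out";
    }
    if (!empty($client->error)) {
        return "error: " . $client->error;
    }
    return "ok: " . trim($client->response_code);
}

// Stand-in for a Snoopy instance after a fetch; the values are made up.
$fake = new stdClass;
$fake->timed_out = false;
$fake->error = "";
$fake->response_code = "HTTP/1.1 200 OK\r\n";
echo describe_outcome($fake), "\n";
// prints: ok: HTTP/1.1 200 OK
```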

Examples:

Example:     fetch a web page and display the return headers and  
            the contents of the page (html-escaped):

include "Snoopy.class.php";  
$snoopy = new Snoopy;

$snoopy->user = "joe";  
$snoopy->pass = "bloe";

if($snoopy->fetch("http://www.slashdot.org/"))  
{  
    echo "response code: ".$snoopy->response_code."<br>\n";  
    // each() was removed in PHP 8; foreach works on all PHP versions
    foreach($snoopy->headers as $key => $val)  
        echo $key.": ".$val."<br>\n";  
    echo "<p>\n";

    echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";  
}  
else  
    echo "error fetching document: ".$snoopy->error."\n";
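
For reference, HTTP basic authentication sends the username and password as a base64-encoded "user:pass" pair in an Authorization header. The sketch below shows that encoding on its own; basic_auth_header() is a hypothetical helper, and when $user and $pass are set on the Snoopy object you do not need to build the header yourself.

```php
<?php
// Build an HTTP Basic Authorization header line (scheme from RFC 7617).
// Note that base64 is an encoding, not encryption.
function basic_auth_header($user, $pass)
{
    return "Authorization: Basic " . base64_encode($user . ":" . $pass);
}

// Same credentials as in the example above.
$header = basic_auth_header("joe", "bloe");
echo $header, "\n";
```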

Example:    submit a form and print out the result headers  
            and html-escaped page:

include "Snoopy.class.php";  
$snoopy = new Snoopy;

$submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";

$submit_vars["q"] = "amiga";  
$submit_vars["submit"] = "Search!";  
$submit_vars["searchhost"] = "Altavista";

if($snoopy->submit($submit_url,$submit_vars))  
{  
    // each() was removed in PHP 8; foreach works on all PHP versions
    foreach($snoopy->headers as $key => $val)  
        echo $key.": ".$val."<br>\n";  
    echo "<p>\n";

    echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";  
}  
else  
    echo "error fetching document: ".$snoopy->error."\n";

Example:    showing functionality of all the variables:

include "Snoopy.class.php";  
$snoopy = new Snoopy;

$snoopy->proxy_host = "my.proxy.host";  
$snoopy->proxy_port = "8080";

$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";  
$snoopy->referer = "http://www.microsnot.com/";

// cookie values are sent as text, so the session ID is given as a string
$snoopy->cookies["SessionID"] = "238472834723489";  
$snoopy->cookies["favoriteColor"] = "RED";

$snoopy->rawheaders["Pragma"] = "no-cache";

$snoopy->maxredirs = 2;  
$snoopy->offsiteok = false;  
$snoopy->expandlinks = false;

$snoopy->user = "joe";  
$snoopy->pass = "bloe";

if($snoopy->fetchtext("http://www.phpbuilder.com"))  
{  
    // each() was removed in PHP 8; foreach works on all PHP versions
    foreach($snoopy->headers as $key => $val)  
        echo $key.": ".$val."<br>\n";  
    echo "<p>\n";

    echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";  
}  
else  
    echo "error fetching document: ".$snoopy->error."\n";

Example:     fetch framed content and display the results

include "Snoopy.class.php";  
$snoopy = new Snoopy;

$snoopy->maxframes = 5;

if($snoopy->fetch("http://www.ispi.net/"))  
{  
    echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";  
    echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";  
    echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";  
}  
else  
    echo "error fetching document: ".$snoopy->error."\n";
