WooYun (white-hat technical community) >> php >> Experts, is there any way to scrape the data from Baidu's search suggestion dropdown?
7# VIP (Fatal error: Call to undefined function getwb() in /data1/www/htdocs/106/wzone/1/index.php on line 10|@齐迹@小胖子@z7y@nauscript|Last night I dreamed of an ecshop injection 0day; when I woke up I had forgotten where it was.|Ad space reserved)
| 2013-02-26 17:29
Finally got it working. The data Baidu returns is not in a format PHP's JSON handling accepts, so it needs some preprocessing first.
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<link type="text/css" rel="stylesheet"
href="http://zone.wooyun.org/themes/wooyun/css/style.css"/></head>
<body>
<?php
/*
author: VIP
date:2013-2-26
*/
$word = isset($_GET['word']) ? trim($_GET['word']) : ''; // avoid an undefined-index notice on first load
if ($word === '')
{
echo <<<EOF
<form action="" method="get">
<p>Keyword: <input type="text" name="word" /></p>
<input type="submit" value="Fetch" />
</form>
EOF;
}
else
{
$data=file_get_contents('http://suggestion.baidu.com/su?wd='.urlencode($word)); // URL-encode the keyword so spaces and non-ASCII characters survive
$data=mb_convert_encoding($data, 'UTF-8', 'UTF-8,GBK,GB2312,BIG5' );
$data=substr($data,17); // drop the 17-character wrapper prefix (assumed to be "window.baidu.sug(")
$data = trim($data,");");
$data = trim($data,"{");
$data=preg_replace("/q:.+?.e,/",'', $data);
$data = str_replace("[","",$data);
$data = str_replace("]","",$data);
$data = "[".$data."]";
$data = str_replace(",","},s:",$data);
$data = str_replace("s:","{\"s\":",$data); // convoluted massaging so the result is valid JSON
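$dc=json_decode($data);
$wd=array();
if (is_array($dc)) // json_decode returns null when the massaged string is still not valid JSON
{
foreach ($dc as $n=>$item)
{
$wd[$n]=$item->s;
echo "<br>".htmlspecialchars($wd[$n],ENT_QUOTES,'UTF-8'); // escape before echoing into the page
}
}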
}
?>
</body>
</html>
The code could be written more simply.
Demo: http://email.smtp.yupage.com/baidu_.php
It fetches the dropdown suggestions (10 of them) for a given keyword.
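On the "could be simpler" point, here is one possible sketch rather than a definitive version. It assumes the response has the shape `window.baidu.sug({q:"...",p:false,s:["...","..."]});`, which is inferred from the string handling above and not confirmed anywhere in this thread. Under that assumption the `s:[...]` list is already a valid JSON array by itself, so it can be pulled out with one regex and decoded directly. The function name `parse_baidu_suggestions` and the sample string are made up for illustration, and the input is expected to be converted to UTF-8 already, as the script above does with `mb_convert_encoding`.

```php
<?php
// Hypothetical helper, not part of the original post.
// Assumed response shape (not verified against the live service):
//   window.baidu.sug({q:"php",p:false,s:["php manual","php tutorial"]});
function parse_baidu_suggestions($raw)
{
    // Capture the bracketed list that follows the unquoted key "s:".
    if (!preg_match('/\bs:(\[.*\])/s', $raw, $m)) {
        return array();
    }
    // A list of double-quoted strings is valid JSON on its own.
    $list = json_decode($m[1], true);
    return is_array($list) ? $list : array();
}

// Hard-coded sample instead of a live request, so the sketch runs offline.
$sample = 'window.baidu.sug({q:"php",p:false,s:["php manual","php tutorial"]});';
print_r(parse_baidu_suggestions($sample));
```

This avoids the chain of `str_replace` calls, which would mangle any suggestion that happens to contain a comma or the text `s:`. If the real response differs from the assumed shape, the function simply returns an empty array.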