PHP code to extract all the hyperlinks from a web page (urgent)

2025-05-14 20:00:36
Recommended answers (3)
Answer (1):

<?php
header("Content-type: text/html; charset=utf-8");
if (!empty($_POST['input_text'])) {
    ini_set('default_socket_timeout', 60); // timeout (in seconds) for file_get_contents
    $data = @file_get_contents($_POST['input_text']);
    if ($data === false) {
        echo "Time out!";
        return false;
    } else {
        // Convert the page data to UTF-8, based on the first "charset" declaration found
        $charset_pos = stripos($data, 'charset');
        if ($charset_pos !== false) {
            if (stripos($data, 'utf-8', $charset_pos) !== false) {
                // Already UTF-8; nothing to convert.
            } else if (stripos($data, 'gb2312', $charset_pos) !== false) {
                $data = iconv('gb2312', 'utf-8//IGNORE', $data);
            } else if (stripos($data, 'gbk', $charset_pos) !== false) {
                $data = iconv('gbk', 'utf-8//IGNORE', $data);
            }
        }
    }

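    // Core code for extracting the hyperlinks.
    // The start of the pattern is an assumed reconstruction ("<a\b([^>"),
    // since only the part from "]*?)>" onward is known.
    $pattern = '/<a\b([^>]*?)>([^<]*?)<\/a>/i';
    preg_match_all($pattern, $data, $links);

    // $links[0] holds the complete <a>...</a> matches, $links[1] the attribute
    // strings (including href), and $links[2] the link texts.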

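    $br = 5; // number of links per table row
    // The table markup is assumed; the loop logic is unchanged.
    echo "<table><tr>";
    // $links[0] is the array of all complete hyperlink matches.
    foreach ($links[0] as $count => $link) {
        if ($count != 0 && $count % $br == 0) echo "</tr><tr>";
        echo "<td>" . $link . "</td>";
    }
    echo "</tr></table>";
    die;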
}else {
?>


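<!-- Minimal form; apart from the title and the input_text field name, the markup is assumed. -->
<html>
<head>
<title>Get Web Page</title>
</head>
<body>
<form method="post" action="">
URL: <input type="text" name="input_text" size="60" />
<input type="submit" value="Get links" />
</form>
</body>
</html>
<?php
}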

//End_php
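
Regex matching like the above tends to miss links whose text contains nested tags (for example an image or bold text inside the link). As an alternative technique, not the one this answer uses, here is a minimal sketch based on PHP's DOM extension. The function name extract_links_dom and the sample HTML are made up for illustration, and the sample is plain ASCII so no charset handling is needed.

```php
<?php
// Hypothetical helper (not part of the answer above): returns the href and
// visible text of every <a> element, using the DOM extension.
function extract_links_dom(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid markup, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        // Skip named anchors that carry no href.
        if ($a->hasAttribute('href')) {
            $links[] = [
                'href' => $a->getAttribute('href'),
                'text' => trim($a->textContent),
            ];
        }
    }
    return $links;
}

// Sample input, invented for this example.
$html = '<p><a href="/a.html">First</a> and '
      . '<a class="x" href="http://example.com/b"><b>Second</b></a></p>';
print_r(extract_links_dom($html));
```

In the script above, this could be called on $data after the charset conversion, in place of the preg_match_all step.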

Answer (2):

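function get_links($content) {
    // The opening <a ... href="..." ...> part of the pattern is an assumed
    // reconstruction. With it, $m[2] holds the href values and $m[4] the link texts.
    $pattern = '/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/i';
    preg_match_all($pattern, $content, $m);
    return $m;
}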
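
To show how the matches from a function like the one above can be turned into URL/text pairs, here is a minimal sketch. The name get_link_pairs, the pattern, and the sample HTML are assumptions for illustration, not the answerer's code. It only handles href values wrapped in single or double quotes.

```php
<?php
// Hypothetical variant of get_links(): returns a list of href/text pairs.
function get_link_pairs(string $content): array
{
    // Group 1: the quote character, group 2: the href value,
    // group 3: the inner HTML of the link.
    $pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1[^>]*>(.*?)<\/a>/is';
    preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

    $pairs = [];
    foreach ($matches as $match) {
        $pairs[] = [
            'href' => $match[2],
            // Drop nested tags such as <b> or <img> from the link text.
            'text' => trim(strip_tags($match[3])),
        ];
    }
    return $pairs;
}

// Sample input, invented for this example.
$sample = '<a href="/a.html">First</a> <a class="x" href=\'b.php?id=1\'><b>Second</b></a>';
foreach (get_link_pairs($sample) as $pair) {
    echo $pair['href'], ' => ', $pair['text'], "\n";
}
```

PREG_SET_ORDER groups the results per link rather than per capture group, which makes the loop simpler than indexing into parallel arrays.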

Answer (3):

To fetch the remote page, just use curl_exec or fopen.

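This approach works:
function get_links($content) {
    // The opening <a ... href="..." ...> part of the pattern is an assumed
    // reconstruction. With it, $m[2] holds the href values and $m[4] the link texts.
    $pattern = '/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/i';
    preg_match_all($pattern, $content, $m);
    return $m;
}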
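
For the curl_exec route mentioned in this answer, a minimal sketch of the fetching step might look like the following. The helper name fetch_page, the timeout values, and the placeholder URL are assumptions, and the curl extension has to be enabled.

```php
<?php
// Hypothetical helper: downloads a page with cURL and returns its body,
// or null if the request fails.
function fetch_page(string $url, int $timeout = 60): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,      // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 10,        // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout,  // seconds allowed for the whole transfer
        // Only allow web URLs, so user input cannot point at local files.
        CURLOPT_PROTOCOLS      => CURLPROTO_HTTP | CURLPROTO_HTTPS,
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Usage (the URL is a placeholder):
// $html = fetch_page('http://example.com/');
// if ($html !== null) {
//     $m = get_links($html);
// }
```

The returned string can then be passed to get_links() in the same way as the output of file_get_contents.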