Sina Tech article scraping code

Published 2024/3/27 13:53:59, 14 views
One-click scraping code for Sina Tech articles, written for ThinkPHP
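The scraper in this post calls a helper named curlgetpage() that the post does not define. A minimal sketch of what such a helper might look like, assuming it only needs to fetch a URL over HTTP and return the response body as a string. The timeout parameter, the user agent string and the empty-string return on failure are assumptions, not part of the original code.

```php
<?php
// Hypothetical stand-in for the undefined curlgetpage() helper.
// Assumption: returns the response body, or '' when the request fails.
if (!function_exists('curlgetpage')) {
    function curlgetpage($url, $timeout = 10)
    {
        $ch = curl_init($url);
        // Return the body from curl_exec() instead of printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Follow redirects, which article links may use.
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        // Give up on slow hosts rather than blocking the whole crawl.
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        // Placeholder user agent (assumption).
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; example-fetcher)');
        $body = curl_exec($ch);
        curl_close($ch);
        return $body === false ? '' : $body;
    }
}
```

Returning an empty string on failure lets the caller skip a page with a simple empty() check instead of handling a boolean false.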
/* Sina Tech article scraper */
public function sina_tech() {
    /* number of list pages to crawl */
    $page_num = intval($_POST['get_post_page_num']);
    if (empty($page_num)) {
        $page_num = 1;
    }
    /* article count before crawling */
    $post_count_a = M('post')->count();
    /* crawl each list page */
    for ($page = 1; $page <= $page_num; $page++) {
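        $fullpage = curlgetpage('http://roll.tech.sina.com.cn/s/channel.php?ch=05#col=30&spec=&type=&ch=05&k=&offset_page=0&offset_num=0&num=5&asc=&page='.$page);
        /* list container; the HTML tags that delimited this pattern are missing from the post and must be restored before use */
        preg_match_all('/\s+(.*)\s+/Us', $fullpage, $match);
        /* the list page is GB2312; convert it to UTF-8 */
        $fullpage = iconv('gb2312', 'utf-8', $match[1][0]);
        /* one match per list item; the <li> tags are inferred from the variable name, as the post lost them */
        preg_match_all('/<li>(.*)<\/li>/isU', $fullpage, $in_li_tags);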
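        foreach (array_unique($in_li_tags[1]) as $row) {
            /* title (the tags that delimited this pattern are missing from the post) */
            preg_match_all('/(.*)/', $row, $title);
            $title = $title[1][0];
            /* link */
            preg_match_all('/href="([^"]*)"/', $row, $link);
            $link = $link[1][0];
            /* date: prepend the current year and append seconds (the tags that delimited this pattern are missing from the post) */
            preg_match_all('/(.*)/i', $row, $date);
            $date = date('Y-', time()) . $date[1][0] . ':00';
            // echo $title . ' ' . $link . ' ' . $date;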
            /* fetch the article page */
            $fullpage_post = curlgetpage($link);
            /* fix tags (the tags that delimited both patterns are missing from the post) */
            $fullpage_post = preg_replace('/(.*)/isU', '${1}', $fullpage_post);
            $fullpage_post = preg_replace('/(.*)/Us', '', $fullpage_post);
            // echo htmlspecialchars($fullpage_post); die;
            /* article content (the tags that delimited this pattern are missing from the post) */
            preg_match_all('/\s+(.*)\s+/Us', $fullpage_post, $post_content);
            /* strip <a> tags, keeping the link text */
            $post_content = preg_replace('/<a[^>]*>(.*)<\/a>/isU', '${1}', $post_content[1][0]);
            // echo $title . ' ' . $link . ' ' . $date . ' ' . $post_content;
            /* save to db only if no post with this title exists yet */
            $post_title_count = M('post')->where(array('title' => $title))->count();
            if ($post_title_count == 0) {
                $datamysql = array();
                $datamysql['title'] = $title;
                $datamysql['content'] = $post_content;
                $datamysql['datetime'] = $date;
                M('post')->add($datamysql);
            }
        }
    }
    /* article count after crawling */
    $post_count_b = M('post')->count();
    $post_add_num = $post_count_b - $post_count_a;
    /* callback */
    if ($post_count_a == $post_count_b) {
        echo '{"success":1,"msg":"Article count unchanged"}';
    } else {
        echo '{"success":1,"msg":"Successfully scraped ' . $post_add_num . ' articles"}';
    }
}
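The per-row parsing steps in the function above (pull the link, strip anchor tags from the body, build a full timestamp) can be tried in isolation without touching the network or the database. The following is a rough sketch under stated assumptions: the function names are hypothetical, the sample markup is invented for illustration, and the date fragment is assumed to look like "MM-DD HH:MM", which is what prepending a year and appending ":00" in the original code suggests.

```php
<?php
// Hypothetical helpers mirroring the parsing steps of sina_tech().

// Return the first double-quoted href value in a fragment, or null if none.
function extract_href($html)
{
    return preg_match('/href="([^"]*)"/i', $html, $m) ? $m[1] : null;
}

// Remove <a ...>...</a> wrappers while keeping the link text.
// The lazy quantifier (.*?) keeps each match to a single anchor.
function strip_anchor_tags($html)
{
    return preg_replace('/<a\b[^>]*>(.*?)<\/a>/is', '$1', $html);
}

// Build "YYYY-MM-DD HH:MM:SS" from an assumed "MM-DD HH:MM" fragment.
// Defaults to the current year, as the original code does.
function build_datetime($month_day_time, $year = null)
{
    if ($year === null) {
        $year = date('Y');
    }
    return $year . '-' . $month_day_time . ':00';
}
```

For example, build_datetime('03-27 13:53', 2024) yields '2024-03-27 13:53:00'. Keeping these steps as small functions also makes it easier to add guards for rows where a pattern does not match, which the original code does not check.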