The AI company Perplexity is complaining their bots can't bypass Cloudflare's firewall
www.searchenginejournal.com
Posted by Davriellelouna@lemmy.world to Technology@lemmy.world · English · 2 days ago (edited) · 235 comments
Electricd@lemmybefree.net · 1 day ago (edited)
They do have a point, though. It would be great to let per-prompt searches go through, but not mass scraping. I believe a lot of websites don’t want either, though.
threeganzi@sh.itjust.works · 1 day ago
Does it not need to be scraped to be indexed, assuming it’s semi-typical RAG stuff?
Electricd@lemmybefree.net · 24 hours ago
I assume their script does some search engine stuff, like querying Google or Bing, and then “scrapes” the links it follows. Some Selenium stuff.