# Robots.txt file for newkerala.com
# Last updated: 2025-10-01
###########################################

# Sitemap for all crawlers
Sitemap: https://www.newkerala.com/sitemap.xml

# -----------------------------------------
# SEARCH ENGINE CRAWLERS - FULL ACCESS
# -----------------------------------------
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-News
User-agent: Googlebot-Video
User-agent: Mediapartners-Google
User-agent: AdsBot-Google
Allow: /
Crawl-delay: 2

User-agent: Bingbot
User-agent: BingPreview
User-agent: MSNBot
Allow: /
Crawl-delay: 2

User-agent: DuckDuckBot
Allow: /
Crawl-delay: 3

User-agent: Applebot
Allow: /
Crawl-delay: 3

# -----------------------------------------
# SOCIAL MEDIA CRAWLERS - ALLOW FOR SHARING
# -----------------------------------------
User-agent: facebookexternalhit
User-agent: Twitterbot
User-agent: LinkedInBot
User-agent: PinterestBot
Allow: /
Crawl-delay: 4

# -----------------------------------------
# TRUSTED AI AND RESEARCH CRAWLERS
# -----------------------------------------
# Whitelisted to safely reference content without overloading the server
User-agent: ChatGPTbot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /
Crawl-delay: 10
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
Disallow: /classic/
Disallow: /support/
Disallow: /cron/
Disallow: /admin/
Disallow: /includes/
Disallow: /logs/
Disallow: /api/
Disallow: /stats/
Disallow: /tracking/
Disallow: /private/
Disallow: /devo/nm/

# -----------------------------------------
# BLOCK ABUSIVE AI AND SCRAPING BOTS
# -----------------------------------------
# Add any other known scrapers to this group as further User-agent lines
User-agent: AhrefsBot
User-agent: SemrushBot
User-agent: MJ12bot
User-agent: DotBot
User-agent: BLEXBot
User-agent: ScreamingFrog
User-agent: CCBot
User-agent: Exabot
User-agent: HTTrack
User-agent: Python-requests
User-agent: Wget
User-agent: rogerbot
Disallow: /

# -----------------------------------------
# CRAWL DELAY FOR UNKNOWN BOTS
# -----------------------------------------
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
Disallow: /classic/
Disallow: /support/
Disallow: /cron/
Disallow: /admin/
Disallow: /includes/
Disallow: /logs/
Disallow: /api/
Disallow: /stats/
Disallow: /tracking/
Disallow: /private/
Disallow: /devo/nm/
Disallow: /news/o/
Crawl-delay: 10

Sitemap: https://www.newkerala.com/world-time-now/sitemap.xml