In this blog, we will study robots.txt, HTTPS, and SSL/TLS certificates and their types in detail. So without any further delay, let’s dive in!
What is Robots.txt?
A robots.txt file is a plain-text file stored in the root directory of a website. It tells search engine crawlers which pages of the website should be crawled and which should not. You can check the robots.txt file of any website by adding /robots.txt after the domain.
For example:
- Amazon’s robots.txt file: https://www.amazon.in/robots.txt
- Flipkart’s robots.txt file: https://www.flipkart.com/robots.txt
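As a small illustration of the "add /robots.txt after the domain" rule, here is a minimal Python sketch. The helper name robots_txt_url is only an assumption made for this example.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that hosts page_url."""
    # The leading slash makes urljoin replace the whole path (and query)
    # of page_url with /robots.txt, keeping only the scheme and domain.
    return urljoin(page_url, "/robots.txt")


print(robots_txt_url("https://www.flipkart.com/some/page?x=1"))
```

Any page URL on the site works as input, because only the scheme and domain are kept.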
Free tools to check robots.txt files:
- Google Search Console (URL Inspection Tool)
- TechnicalSEO.com Robots.txt Tester
- Google’s Robots.txt Tester Tool
Syntax (Structure) of Robots.txt File
A robots.txt file contains two main elements:
- User-agent: It specifies which search engine bot the rule applies to.
- Disallow/Allow: It defines which pages or folders should be crawled or not.
Example Robots.txt File:
User-agent: *
Disallow: /admin/
Disallow: /cart/
Allow: /
Explanation:
- User-agent: *: Applies the rules to all search engines.
- Disallow: /admin/: Nothing under the /admin/ folder should be crawled.
- Disallow: /cart/: Nothing under the /cart/ folder (the shopping cart) should be crawled.
- Allow: /: All other pages can be crawled.
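If you want to test rules like these yourself, Python's standard library ships a robots.txt parser. The sketch below feeds it the example file from above; the crawler name ExampleBot and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network call)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for url in (
    "https://www.example.com/admin/login",
    "https://www.example.com/cart/",
    "https://www.example.com/products/shoes",
):
    # ExampleBot is a hypothetical crawler; it falls under the "*" rules
    print(url, "->", parser.can_fetch("ExampleBot", url))
```

The first two URLs come back as not fetchable and the last one as fetchable, matching the explanation above.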
Real-Time Example: Robots.txt in E-commerce Websites
Example 2: Breakdown of Amazon’s Robots.txt File
If you check Amazon’s robots.txt file, you’ll find something like this:
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /gp/
Disallow: /hz/
Why is Amazon Doing This?
- /cart/ (Shopping Cart Page): Indexing this page has no value, since its contents are specific to each user.
- /checkout/ (Payment Page): This should not be indexed, as it is only accessible to logged-in users.
- /gp/ and /hz/ (Internal Pages): These are Amazon’s internal pages, which do not need to be visible to Google.
Special Cases in Robots.txt (Advanced Concepts)
1. Setting Rules for a Specific Search Engine Bot
If you want to allow/disallow only Googlebot, you can write:
User-agent: Googlebot
Disallow: /private/
This means only Googlebot is restricted from crawling the /private/ folder, while other search engines can crawl it.
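The per-bot behaviour can be checked with the same standard-library parser. This is only a sketch: the URL is invented, and Bingbot is used here simply as an example of "some other crawler".

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: Googlebot",
    "Disallow: /private/",
])

# Hypothetical URL inside the restricted folder
private_url = "https://www.example.com/private/report.html"

googlebot_allowed = rules.can_fetch("Googlebot", private_url)
bingbot_allowed = rules.can_fetch("Bingbot", private_url)

print("Googlebot:", googlebot_allowed)
print("Bingbot:", bingbot_allowed)
```

Since the file has no rules for other user agents, the parser treats them as allowed.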
2. Adding a Sitemap in Robots.txt
If you want Google to quickly discover your XML Sitemap, you can add it to robots.txt:
User-agent: *
Disallow:
Sitemap: https://www.example.com/sitemap.xml
This means there are no disallow rules, and the Sitemap line tells crawlers where to find your sitemap.xml.
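To confirm that a Sitemap line is picked up, you can read it back with Python's robots.txt parser. A minimal sketch, assuming Python 3.8 or newer (where the site_maps() method was added) and a made-up crawler name:

```python
from urllib.robotparser import RobotFileParser

sitemap_rules = RobotFileParser()
sitemap_rules.parse([
    "User-agent: *",
    "Disallow:",
    "Sitemap: https://www.example.com/sitemap.xml",
])

# site_maps() returns the listed sitemap URLs, or None if there are none
sitemaps = sitemap_rules.site_maps()

# An empty Disallow blocks nothing, so any path may be crawled
everything_allowed = sitemap_rules.can_fetch(
    "ExampleBot", "https://www.example.com/any/page"
)

print(sitemaps, everything_allowed)
```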
3. Completely Blocking Search Engines
If you want to block your entire website from search engines, use:
User-agent: *
Disallow: /
This means search engine crawlers that follow robots.txt will not crawl any page of your website.
4. Allowing a Specific Page (Even If the Parent Folder is Blocked)
If you block an entire folder but want to allow a specific page within it, write:
User-agent: *
Disallow: /blog/
Allow: /blog/best-seo-tips.html
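This means the /blog/ folder is blocked, but the page /blog/best-seo-tips.html can still be crawled. Google applies the most specific (longest) matching rule, so the Allow rule wins for that page.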
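The "longest matching rule wins" idea can be sketched in a few lines of Python. This is a simplified illustration, not a full robots.txt implementation: it only handles plain path prefixes (no * or $ wildcards), and the function name is_allowed is an assumption of this example. Python's built-in urllib.robotparser checks rules in file order instead, which is why it is not used here.

```python
def is_allowed(rules, path):
    """Apply longest-match precedence to (directive, path_prefix) rules."""
    best_length = -1
    allowed = True  # no matching rule means the path may be crawled
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        longer = len(prefix) > best_length
        # When two matching rules are equally long, Allow takes precedence
        tie_goes_to_allow = len(prefix) == best_length and is_allow
        if longer or tie_goes_to_allow:
            best_length = len(prefix)
            allowed = is_allow
    return allowed


BLOG_RULES = [
    ("Disallow", "/blog/"),
    ("Allow", "/blog/best-seo-tips.html"),
]

for path in ("/blog/best-seo-tips.html", "/blog/other-post.html", "/about"):
    print(path, "->", is_allowed(BLOG_RULES, path))
```

Only /blog/other-post.html is blocked: the allowed page matches a longer rule, and /about matches no rule at all.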
Limitations of Robots.txt (Important Points)
1. Robots.txt is NOT 100% Secure!
Even if you disallow a page in robots.txt, it will still be publicly accessible.
Example: If you block /private-data/, users can still access it directly via:
https://www.example.com/private-data
2. Google No Longer Supports “Noindex” in Robots.txt
Earlier, we could use:
Disallow: /page/
Noindex: /page/
Now, Google does not support noindex in robots.txt. Instead, use the “noindex” meta tag inside the page’s HTML.
3. Crawling vs. Indexing
If you write Disallow: /page/, Google won’t crawl the page.
However, if there are external links to this page, Google can still index its URL without crawling it. To keep a page out of search results, add a “noindex” meta tag to the page and do not block it in robots.txt, because Google has to crawl the page to see the tag.
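To check whether a page actually carries a “noindex” robots meta tag, you can scan its HTML with Python's standard library. This is a rough sketch: the class and function names are invented for the example, and it only looks at the generic name="robots" tag, not at bot-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow"
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(sample))
```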
Why is Robots.txt Important for SEO?
Here is why robots.txt is important for SEO:
- Crawling Control: This helps search engines understand which pages to crawl and which to ignore.
- Avoiding Duplicate Content: It prevents search engines from crawling duplicate pages.
- Server Load Optimization: It reduces unnecessary crawling, improving website performance.
- Private Areas: It asks crawlers to skip URLs such as admin panels, although this is not real protection because the file is public and the pages stay accessible.
What is HTTPS?
HTTPS is short for HyperText Transfer Protocol Secure, the secure version of HTTP. It encrypts data so that the communication between the user and the website remains private. Let’s understand this with the help of real-world examples.
Simple Example:
Imagine you are sending money to a bank. So:
- If you use HTTP, it’s like sending money without any security, making it easy for someone to steal it in transit.
- If you use HTTPS, it’s like locking the money in a secure locker, ensuring that no one can intercept it.
Real-World Examples of HTTPS Websites:
- Google: https://www.google.com
- Amazon: https://www.amazon.com
- Facebook: https://www.facebook.com
- Flipkart: https://www.flipkart.com
You can check if a website has SSL or not with an SSL Checker Tool:
https://www.sslshopper.com/ssl-checker.html
Why is HTTPS Important?
HTTPS is necessary if you run a website. Below is why it is important:
- SEO Benefits: Google uses HTTPS as a ranking signal, so moving from HTTP to HTTPS can help your rankings.
- Security: The secured version of HTTP keeps user data safe.
- User Trust: The padlock symbol increases credibility.
- Compliance: It helps you meet the data-in-transit encryption requirements of standards such as GDPR and PCI DSS.
Difference Between HTTPS and HTTP
Feature | HTTP | HTTPS |
Security | Not secure | Secure with SSL/TLS encryption |
Data Encryption | Data is transferred in plain text | Data is encrypted |
SEO Benefit | No ranking benefit | HTTPS is a (lightweight) Google ranking signal |
Browser Indication | Shows “Not Secure” warning | Shows a padlock or similar secure-connection indicator (varies by browser) |
Performance | No encryption overhead | Small handshake overhead, usually offset by HTTP/2 or a CDN such as Cloudflare |
Example:
- HTTP Website: http://www.example.com
- HTTPS Website: https://www.example.com
If you open an HTTP site in Chrome or Firefox, you’ll see a “Not Secure” warning in the address bar. It advises you not to enter any sensitive information, because the connection to the site is not encrypted.
What is SSL/TLS?
SSL (Secure Sockets Layer) and TLS (Transport Layer Security) are encryption protocols that protect user data like passwords, credit card details, and personal information while it travels between the browser and the server. TLS is the modern successor to SSL, which is now deprecated, although the name “SSL” is still widely used. The prime objective of SSL/TLS is to encrypt data using mathematical algorithms and make it unreadable to anyone who doesn’t have the decryption key.
How Does SSL/TLS Work?
When you visit an HTTPS site, the following steps take place:
- The browser sends a request for a secure connection.
- The server responds with an SSL/TLS certificate for verification.
- An encryption key exchange happens (handshake process).
- A secure connection is established, and all data is encrypted before transfer.
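The steps above can be observed from Python with the standard ssl module. This is a minimal sketch under a few assumptions: the function name tls_summary is made up, the machine has network access, and the server is reachable on port 443.

```python
import socket
import ssl


def tls_summary(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a TLS connection and report what was negotiated."""
    # The default context verifies the certificate chain and the hostname
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # wrap_socket performs the handshake: the server presents its
        # certificate, keys are exchanged, and encryption is switched on
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),       # e.g. "TLSv1.3"
                "cipher": tls.cipher()[0],       # negotiated cipher suite
                "expires": cert.get("notAfter"), # certificate expiry date
            }
```

Calling tls_summary("www.example.com") returns the negotiated protocol version, the cipher suite, and the certificate's expiry date. If the certificate is invalid or expired, the handshake raises ssl.SSLCertVerificationError instead.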
Example:
If you are entering your details on Amazon and the site is using HTTP, a hacker on the same network could steal your login credentials or card details, because the data travels in plain text.
But with HTTPS (SSL/TLS), the data is encrypted in transit, so anyone who intercepts it cannot read it.
What is the Role of an SSL/TLS Certificate?
The primary role of an SSL or TLS certificate includes:
- Encryption: This certificate ensures that any data entered by the user is encrypted before reaching the server.
- Authentication: It confirms that the website is legitimate and not fake or fraudulent.
- Data Integrity: It ensures that data remains unchanged during transmission.
Impact of HTTPS on SEO
In 2014, Google officially announced that HTTPS is a ranking factor. If your website still uses HTTP, Google will consider it less secure, which can hurt your rankings.
Example:
If two websites have the same content, but one uses HTTPS and the other HTTP, Google may give a slight ranking preference to the HTTPS site.
In short:
- HTTPS protects users’ data and builds trust.
- Google boosts rankings for secure HTTPS websites.
- Having an SSL/TLS certificate is essential for security, SEO, and user trust.
How to Implement HTTPS? (Step-by-Step Guide)
If you want to shift your website from HTTP to HTTPS, follow these steps:
Step 1: Get an SSL Certificate
Free SSL Options:
- Let’s Encrypt (Most hosting providers offer it for free)
- Cloudflare SSL
Paid SSL Options:
If you run an e-commerce site or use a payment gateway, a paid SSL is recommended.
- Single Domain SSL for one domain
- Wildcard SSL for multiple subdomains
- EV SSL adds the strictest business identity verification
For example, if you have an e-commerce site where customers enter credit card details, you could opt for EV SSL for the highest level of identity assurance. The encryption itself is the same for DV, OV, and EV certificates.
Step 2: Install SSL on Your Website
The installation process depends on your hosting provider:
- cPanel Users: Go to “SSL/TLS” in cPanel and install SSL.
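- Cloudflare Users: Activate SSL in Cloudflare and select Full mode where possible. Flexible mode only encrypts the connection between the visitor and Cloudflare, not between Cloudflare and your server.
For example, if you use Hostinger, Bluehost, SiteGround, or GoDaddy, SSL may be included for free or may require manual installation, depending on your plan.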
Step 3: Redirect Your Website to HTTPS
Once SSL is installed, you need to set up a 301 Redirect to ensure all old HTTP URLs automatically redirect to HTTPS.
For Apache Server (.htaccess file):
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
For Nginx Server:
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://example.com$request_uri;
}
This ensures that all HTTP requests automatically redirect to HTTPS.
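Once the redirect is in place, it is worth verifying it. The Python sketch below is one possible way to do that with the standard library; both function names are assumptions for this example, and redirects_to_https needs network access to the site being checked.

```python
import http.client
from urllib.parse import urlsplit, urlunsplit


def expected_https_url(http_url: str) -> str:
    """Return the HTTPS version of a URL, keeping host, path, and query."""
    parts = urlsplit(http_url)
    return urlunsplit(("https", parts.netloc, parts.path or "/", parts.query, ""))


def redirects_to_https(http_url: str, timeout: float = 5.0) -> bool:
    """Send one HEAD request over plain HTTP and inspect the response."""
    parts = urlsplit(http_url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # http.client does not follow redirects, so the 301 itself is visible
    conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        location = response.getheader("Location", "")
        return response.status == 301 and location.startswith("https://")
    finally:
        conn.close()


print(expected_https_url("http://www.example.com/shop?item=1"))
```

A call such as redirects_to_https("http://www.example.com/") returns True only when the server answers with a permanent (301) redirect to an https:// address.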
Step 4: Update Google Search Console & Analytics
- Go to Google Search Console: Click on “Add a Property”
- Submit your new HTTPS URL
- Update your Sitemap
- Check robots.txt to ensure your HTTPS pages are not blocked
Example:
If your old sitemap was http://www.example.com/sitemap.xml then update it to https://www.example.com/sitemap.xml
Step 5: Update Internal Links & CDN
- Change all internal links from HTTP to HTTPS
- If you are using Cloudflare or a CDN then enable HTTPS there as well.
Example:
If your CSS or JavaScript files still use HTTP, update them:
<link rel="stylesheet" href="https://www.example.com/style.css">
Wondering why? If resources such as images, CSS files, or scripts still load over HTTP on an HTTPS page, you might get a “Mixed Content Error”. This is why it is crucial to update everything to HTTPS.
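To find leftover HTTP resources before the browser complains, you can scan a page's HTML. The following Python sketch is a rough illustration: the class name, function name, and sample page are invented, and it only checks a handful of common tags (scripts, images, iframes, and stylesheets).

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect resources that an HTTPS page would still load over HTTP."""

    # Tags that load a resource, and the attribute holding its URL
    RESOURCE_ATTRS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        wanted = self.RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return  # ordinary <a> links are not mixed content
        attr = dict(attrs)
        # Only <link> tags that load a stylesheet count as resources here
        if tag == "link" and "stylesheet" not in (attr.get("rel") or "").lower():
            return
        url = attr.get(wanted) or ""
        if url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_insecure_resources(html: str) -> list:
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure


PAGE = """
<link rel="stylesheet" href="http://www.example.com/style.css">
<script src="https://www.example.com/app.js"></script>
<img src="http://www.example.com/logo.png">
<a href="http://www.example.com/">Home</a>
"""
print(find_insecure_resources(PAGE))
```

In the sample page, the stylesheet and the image are reported, while the HTTPS script and the plain hyperlink are not.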
In short:
- SSL ensures security and encrypts data
- Google ranks HTTPS sites higher
- 301 redirects and updated internal links prevent errors
Common HTTPS Issues & Solutions
Issue | Solution |
Mixed Content Error | Convert all internal links, images, and scripts to HTTPS |
SEO Ranking Drop | Set up 301 Redirects and update Google Search Console |
SSL Certificate Expired | Renew the SSL certificate |
Slow Loading After HTTPS Migration | Enable Cloudflare or HTTP/2 |
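Expired certificates are easy to avoid if you track the expiry date. As a minimal sketch, the Python function below turns a certificate's expiry string into "days left"; the function name is an assumption, and the date format is the one Python's ssl module returns in the notAfter field of a certificate.

```python
import ssl
import time


def days_until_expiry(not_after: str, now: float = None) -> int:
    """Return the number of whole days before a certificate expires.

    not_after uses the format found in a certificate's notAfter field,
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    # A negative result means the certificate has already expired
    return int((expires - current) // 86400)


# Example with a fixed "now" that is exactly 10 days before the expiry date
expiry = "Jan  5 09:34:43 2018 GMT"
ten_days_before = ssl.cert_time_to_seconds(expiry) - 10 * 86400
print(days_until_expiry(expiry, now=ten_days_before))
```

Running a check like this on a schedule and renewing when the number drops below a threshold you choose (for example, 30 days) helps prevent the “SSL Certificate Expired” issue from the table above.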
Types of SSL Certificates: DV, OV, EV Explained
There are three major types of SSL certificates, which differ in validation level and trust level. Let’s understand them with real-world examples:
1. Domain Validation (DV SSL) – Basic Security
DV SSL is the most basic and easiest SSL certificate. It only verifies domain ownership (ensuring you own the domain). The Certificate Authority (CA) verifies ownership via email or DNS verification. No business verification or extra documentation is required.
Real-life Example:
If you run a personal blog or portfolio website, DV SSL is sufficient. Examples:
- foodiekrishan.com (A personal food blog)
- photographybykrishan.com (A photography portfolio)
You only need basic security to avoid the “Not Secure” warning, making DV SSL the best choice!
Pros:
- Fast approval: Can be issued in just a few minutes
- Free options are available, like Let’s Encrypt
Cons:
- Lower trust level: Since only domain ownership is verified and not business identity
- Not recommended for e-commerce or financial sites
2. Organization Validation (OV SSL) – Business Trust
OV SSL is a more advanced SSL certificate that verifies both domain ownership and business details. The Certificate Authority (CA) verifies your domain, business name, address, and registration. Legal business documents must be submitted to get this certificate. The business name appears in the SSL certificate.
Real-life Example:
If you run a small business website that handles customer data then OV SSL is recommended.
Examples:
- skillwaala.com (E-learning website)
- krishanconsulting.com (Consulting services)
- localhospital.com (Healthcare services)
Since these businesses handle user logins or small transactions, trust is crucial. OV SSL gives them extra credibility.
Pros:
- Higher trust level: Users see the business identity verification
- Protection against phishing & fake websites
Cons:
- It takes longer to approve (1–3 days) due to business verification
- It is more expensive than DV SSL
3. Extended Validation (EV SSL) – Maximum Trust
EV SSL is the highest-level SSL certificate that comes with full business verification. It is a must for banks, e-commerce, and high-security websites. To get this certificate, the Certificate Authority (CA) verifies your legal business documents, registration, address, and phone number. Sometimes, a manual verification call is required. This process takes the longest time (3–7 days) because it is a high-security SSL.
Real-life Example:
If you handle financial transactions, online payments, or sensitive customer data, EV SSL is a must. Examples include:
- hdfcbank.com (Banking website)
- amazon.com (E-commerce website)
- paypal.com (Online payments)
Earlier, EV SSL websites displayed a green address bar (now removed), but they still offer maximum trust.
Pros:
- Highest level of identity assurance because the entire business is verified.
- Best protection against fraud & phishing attacks.
Cons:
- Slow approval (can take 3–7 days).
- Very expensive (can cost ₹10,000+ per year)
Which SSL Certificate Should You Choose?
Look at the table given below to make an informed decision.
SSL Type | Security Level | Use Case | Approval Time | Cost |
DV SSL | Basic | Personal blogs, small websites | Minutes | Free – ₹1,000 |
OV SSL | Medium | Business websites, login-based sites | 1–3 Days | ₹2,000 – ₹5,000 |
EV SSL | High | Banks, e-commerce, payment websites | 3–7 Days | ₹10,000+ |
Conclusion
We hope you now understand the concepts of robots.txt, SSL/TLS certificates, and HTTPS. These three topics are important when you embark on a journey to learn SEO. If you have any questions on these topics, drop a message on our WhatsApp number or fill in the query form so that our trainer can connect with you to clear your doubts. Till then, keep practicing and don’t forget to watch our classes on Skillwaala’s YouTube Channel.