
January 16, 2012

HOW TO VIEW HIDDEN DIRECTORIES IN A WEBSITE USING ROBOTS.TXT

Welcome to "HACKING begins - An approach to introduce people with the truth of HACKING". Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called the Robots Exclusion Protocol. Many times in hacking you need to know a site's web directories, but it is hard to find them out, as a server may contain countless directories. A major hole in this security is robots.txt.


What is Robots.txt



Robots.txt is a file in the [ wwwroot ] of a server that defines for the bots what they may do on the website.

There are many bots on the internet; the most famous are the Google search engine bot (aka the Google Spider), the Yahoo search engine bot, and many others.
What robots.txt does is give orders to a bot on how to spider the website.

Now you may ask: what is the use of the robots.txt file?
Well, webmasters use it to give instructions to the bots that visit their website, and also to mark the directories on the website where a bot should not go and spider.
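To see the effect of these orders in practice, here is a small sketch using Python's standard urllib.robotparser module. The two rules are the same kind shown later in this post, and the bot name "MyBot" is made up for illustration:

```python
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
# parse() accepts the file's lines directly, so no network request is needed
rp.parse([
    "User-agent: *",
    "Disallow: /search",
])

# Any path under /search is off-limits to every bot...
print(rp.can_fetch("MyBot", "http://hackingbegins.com/search/label/hacking"))  # False
# ...while ordinary pages are still allowed.
print(rp.can_fetch("MyBot", "http://hackingbegins.com/2012/01/post.html"))     # True
```

A well-behaved bot checks can_fetch() before downloading each page; a human, of course, can simply read the Disallow lines directly.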


Analyzing Robots.txt For Hacking Stuff



Well, it's really simple. The first question you would ask is: where is robots.txt located?
The answer is that it's in the [ WWWROOT ].
Still don't understand? It's in the main directory of the site.
Let's take the example of HACKING begins:

http://hackingbegins.com/robots.txt
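In other words, the location is always the site root, whatever page you start from. As a sketch, the helper below (robots_url is a hypothetical name) builds the robots.txt address for any page on a site:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://hackingbegins.com/2012/01/some-post.html"))
# -> http://hackingbegins.com/robots.txt
```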






Go ahead and type it into the address bar of your browser. What do you see?


Do you see that? This is the robots.txt for this blog. Now let's analyze it line by line.


First Line :-
User-agent: Mediapartners-Google

This means that the statements which follow are meant for Google's Mediapartners bot (strictly speaking, this is Google's AdSense crawler rather than the main search spider).


Second Line :-
Disallow:


This means that nothing is disallowed to the Google bot. Remember, these orders are given to the Google bot only, not to other bots.


Third Line :-

User-agent: *


This means that all the bots coming to the blog will follow these rules. Note that the previous rules were for the Google bot only.


Fourth Line :-

Disallow: /search


This means that no bot will spider the files under the /search directory of this blog.


Fifth Line :-

Sitemap: http://hackingbegins.com/feeds/posts/default?orderby=updated
This is basically my blog's sitemap; it is not very important here.
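The whole file analyzed above can also be pulled apart programmatically. Below is a minimal sketch of a hand-rolled parser over the same five lines; real robots.txt files have more directives (Allow, Crawl-delay, comments), so this is only for illustration:

```python
# The robots.txt body analyzed above, reproduced inline.
robots_txt = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Sitemap: http://hackingbegins.com/feeds/posts/default?orderby=updated
"""

rules = {}    # maps each user-agent to its list of Disallow paths
agent = None
for raw in robots_txt.splitlines():
    line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
    if not line:
        continue
    field, _, value = line.partition(":")
    field, value = field.strip().lower(), value.strip()
    if field == "user-agent":
        agent = value
        rules.setdefault(agent, [])
    elif field == "disallow" and agent is not None and value:
        # An empty Disallow (like the Mediapartners one) forbids nothing,
        # so only non-empty paths are recorded.
        rules[agent].append(value)

print(rules)
# {'Mediapartners-Google': [], '*': ['/search']}
```

The printed dictionary matches the line-by-line reading above: the Mediapartners bot is disallowed nothing, and every other bot is kept out of /search.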

For more info: robots.txt

Be a real hacker - PROFESSIONAL, and change the trend of HACKING.

Thanks & Regards:

Sahil Mahajan
 

11 comments:

Wow, awesome example. But I have a query regarding robots.txt: can we see the hidden directories of any website, and what is the purpose of finding hidden files? What can we do with these files? Also, I checked one website by putting robots.txt at the end of the URL but wasn't able to see anything; a "page not found" error occurred.

Nice, it worked on http://www.bsnl.co.in/robots.txt

User-agent: *
Disallow: /_mm/
Disallow: /_notes/
Disallow: /_baks/
Disallow: /MMWIP/
Disallow: /company/jtoresult06/

User-agent: googlebot
Disallow: *.csi

This came up; now what am I supposed to do with it???

I'm your 88888th page viewer ....

User-agent: *
Disallow: /_mm/
Disallow: /_notes/
Disallow: /_baks/
Disallow: /MMWIP/
Disallow: /company/jtoresult06/

User-agent: googlebot
Disallow: *.csi


I didn't understand the meaning of the above commands. Please explain in depth what the above lines mean.

My friend, please tell me the next step...

User-agent: Googlebot
Disallow: /ac.php
Disallow: /ae.php
Disallow: /album.php
Disallow: /ap.php
Disallow: /autologin.php
Disallow: /checkpoint/
Disallow: /feeds/
Disallow: /l.php
Disallow: /o.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photo_comments.php
Disallow: /photo_search.php
Disallow: /photos.php


Please tell me sir, what will this do? Please explain in detail.

How can I run a file without any permission...

This is one of the things taught in Web Development 101. Those are not "hidden directories", and this really doesn't qualify as "hacking" of any sort...


I hope you got some great ideas from this post! Please feel free to share additional ideas or queries.