Posts tagged seo

Measuring the quality of your blog

Although there are plenty of online resources for measuring the quality of your blog, such as Blog Juice, Website Grader (the bottom of its report is dedicated to the blog) and Technorati Authority, all of these services focus on popularity and sometimes on SEO aspects.

Since a blog is all about people reading your content, the best way to measure your site's quality is to use Google Analytics and check the time spent per visit and the time spent per page. If you see that 90% of visits last less than 10 seconds, it doesn't matter that your traffic is 10k: your blog is not delivering quality. It may be the design or the content, but something is making users leave before they have had time to find out what the page is about.

Time spent on the site, Google analytics graph

As you can see in the image, you get an average, a daily breakdown, and a graph displaying the variations. The ideal time is the time it takes a user to read the article and maybe post a comment. Most users will take about a minute to do all that. If your average is under a minute, you need to do something about it: go through the stats page by page and check how much time visitors spend on each one, which gives you an insight into what's happening.
Also check the main exit pages (which will usually be the same ones visitors entered on). Sometimes you'll see one particular page being exited too often, and the cause is usually the article title. An engaging title will keep your visitors on the site, at least long enough to decide whether your article is interesting or not.
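
If you export the visit durations, the two numbers discussed above are easy to compute yourself. This is only a small sketch: the `visit_quality` helper and the sample durations are made up for illustration, and the 10-second threshold is simply the figure mentioned earlier in this post.

```python
from statistics import mean

def visit_quality(durations_s, quick_exit_threshold_s=10):
    """Summarise visit durations given in seconds."""
    if not durations_s:
        raise ValueError("need at least one visit")
    # Count the visits that ended before the threshold
    quick_exits = sum(1 for d in durations_s if d < quick_exit_threshold_s)
    return {
        "average_s": mean(durations_s),
        "quick_exit_share": quick_exits / len(durations_s),
    }

# Hypothetical data: seconds spent on the site by ten visits
sample = [4, 6, 3, 8, 5, 7, 2, 9, 6, 120]
report = visit_quality(sample)
print(report)
```

Notice how one long visit pulls the average up while nine visitors out of ten still left almost immediately, which is why it pays to look at both numbers and not only the average.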

A reliable way to keep users on the site is to put an image in every article. It gives a visual representation of what the article is about and, as you know, an image is understood faster than a whole paragraph.
Another good way to help readers see what the article is about is to highlight important words, which shows them where to look when skimming through the article.

Follow these simple tips and you'll increase the time users spend on your blog, which will hopefully mean they start reading your articles from start to end!

Optimizing your landing page

Optimize your landing page and get more visitors
There is no need for me to prove to you that optimizing your website's landing page will improve your Return on Investment (ROI) and your Rate of Return (ROR). It is probably one of the hardest, most overlooked and most important parts of SEO for a website. The landing page is the showcase of your business, and most visitors will decide whether to stay or leave just by looking at your home page.
So what should you do to keep them from leaving? I've asked myself the same question and come up with the following statements. Check right now how well you are doing:
(They are not in any particular order)

  1. Your homepage is persuasive and makes visitors feel comfortable
    A common mistake is writing your content on the go and forgetting to check it for errors. The text on your homepage is probably the most important text you will write, and errors in it will cost you your credibility. Other common mistakes are missing images, low-resolution images and images unrelated to the content. The visual aspect is even more important, since it determines the visitor's first impression.
  2. You provide easy navigation that focuses on the most important links
    Always show the visitor clearly what the next step should be. Use brighter colors or make the link stand out from the others, however you like, but make sure there is a path from the homepage to the result you want to achieve, whether that is contacting you, viewing your showcase or buying your product.
  3. Your content resolves any possible doubt the visitor might have
    People won't know what your incredible product is, so an "About" page, for example, is a must-have. If you are unsure how to organize the doubt-resolving content, consider building a FAQ, which gives easy access to answers for the problems a visitor is likely to encounter.
  4. Your copy has been optimized to capture visitors
    The content you write is the most important part once a visitor has gone past the homepage. They have already decided that your site looks worth a visit, and now you must prove it with content. Whenever possible, test what works best for your niche: long text or short text?
  5. Your idea is clear after no more than 4 seconds of looking at your homepage
    If that is not true, you have to redo it. Your main point, idea or product, basically whatever you are offering, must be completely clear to the visitor after no more than 4 seconds. Show too much unrelated information and the visitor will probably leave; visitors want to know what they will find before they actually have to search for it.
  6. You think your homepage is great
    This is the most important one: do you think your homepage is great? If you don't, then don't waste your time reading lists like this one. Delete it and start from scratch, build something you consider good, and then you can bother with the tiny details.

Hope you could agree with all the statements!

Writing a good robots.txt file | SEO Tips

Search bots crawl each URL, and the first thing they look for at the root of a site is the robots.txt file. By writing our own robots.txt we can change the search bots' behaviour and tell them where they may crawl and index and where they may not. Imagine we have private folders on our website, for example a folder or file containing e-mail addresses we don't want published. We can keep search engine robots away from it with a few simple commands in the robots.txt file. Here we go:

Introduction

We use the /robots.txt file to give instructions about our site to web robots; this is called The Robots Exclusion Protocol.

Simply put, robots.txt is a very simple text file placed in our root directory, for example http://urbanoalvarez.es/robots.txt. This file tells search engines and other robots which areas of our site they are allowed to visit and index.
The only thing you must take into account is that ONLY one robots.txt file is allowed per site, and ONLY in the root directory (where our home page is).

TRUE: http://urbanoalvarez.es/robots.txt (Works)

FALSE: http://urbanoalvarez.es/images/robots.txt (Doesn't work)
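
To make the "root only" rule concrete, here is a small Python sketch that works out the one robots.txt address a robot would look at for any page of a site. The `robots_url` helper is my own name for illustration; it only relies on the standard `urllib.parse` module.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt location that applies to page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://urbanoalvarez.es/images/photo.jpg"))
# prints http://urbanoalvarez.es/robots.txt
```

Whatever folder the page lives in, the answer is always the file at the root, which is why a copy placed in /images/ is never read.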

All major search engine spiders respect this file, but unfortunately most spambots (email collectors, harvesters) do not. If you want security on your site, or if you have files or content to hide, you have to put the files in a protected directory; you can't rely on the robots.txt file alone.

Setting up our file

So what programs do we need to create it? Just good old Notepad or any other text editor. All we need to do is create a new text file and rename it. Attention: the name has to be "robots.txt"; it cannot be "robot.txt", "Robot.txt" or "robots.TXT".
Simple: no caps, and "robots" in the plural!

Writing the rules

Now let's start writing in it. A simple robots.txt looks like this:

User-agent: *
Disallow:

The "User-agent: *" line means this section applies to all robots; the wildcard "*" stands for all bots. The empty "Disallow:" line tells the robots that nothing is off limits, so they can go anywhere they want.

User-agent: *
Disallow: /

This one uses the wildcard "*" too, so it applies to all bots. The only difference is the slash "/" in the Disallow line, which means "don't allow anything to be crawled", so the bots won't crawl your website. The good ones, of course ;)

To recap: if we want a section to apply to all the bots, we put the wildcard "*" in the User-agent line. Leaving the Disallow line blank means "come crawl my site, you bots!", and a slash means "keep out!". This is the simplest setup; next we can learn how to let some bots crawl while keeping others out.
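
If you want to check how a well-behaved robot would read these two files, Python ships a robots.txt parser in its standard library (`urllib.robotparser`). The sketch below feeds it both files as text; `load_rules` is a helper name of my own and "SomeBot" is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

def load_rules(text):
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

allow_all = load_rules("User-agent: *\nDisallow:\n")
block_all = load_rules("User-agent: *\nDisallow: /\n")

# "SomeBot" is a hypothetical crawler, used only for this check
page = "http://urbanoalvarez.es/blog/post.html"
print(allow_all.can_fetch("SomeBot", page))  # True
print(block_all.can_fetch("SomeBot", page))  # False
```

Keep in mind this only shows what a polite robot would do; nothing here forces a robot to obey.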

Advanced rules

The User-agent line is where we name the bot whose behavior we want to define. For example, if we want the Google bot to crawl the site but not the Yahoo bot, how will our text file look?

User-agent: googlebot
Disallow:

User-agent: yahoo-slurp
Disallow: /

In this sample, we named googlebot and left its Disallow line blank, which tells it to crawl the website. In the second group we named the Yahoo bot and put a slash in its Disallow line, which tells it to stay away.
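
The same file can be checked bot by bot with Python's standard `urllib.robotparser`. In this sketch "msnbot" stands in for any bot the file does not mention, and the home page URL is just an example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: yahoo-slurp
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

home = "http://urbanoalvarez.es/"
for bot in ("googlebot", "yahoo-slurp", "msnbot"):
    # Prints the bot name and whether it may fetch the home page
    print(bot, parser.can_fetch(bot, home))
```

With this parser googlebot is allowed, yahoo-slurp is refused, and msnbot is allowed as well: a bot that is not named and has no "*" group to fall back on is not restricted at all.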

Next we are going to learn how to keep search engine spiders out of some folders of our site while letting them crawl others. To do this, we change the value in the Disallow line. Say we have two folders in our domain, /images and /emails, and we want /images to be crawled but not /emails. Then the text file would look like:

User-agent: *
Disallow: /emails/

As we can see, the section applies to all robots and excludes only the /emails folder; the rest of the website can still be crawled.
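
A quick way to see which paths the rule covers is to test a few of them, again with Python's standard parser. The file names below are invented for the example, and "SomeBot" is a made-up crawler.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /emails/"])

# Hypothetical paths on the site
for path in ("/emails/list.txt", "/images/logo.png", "/index.html"):
    url = "http://urbanoalvarez.es" + path
    print(path, parser.can_fetch("SomeBot", url))
```

Only the path under /emails/ comes back as not fetchable. The rule is a plain prefix match, so everything inside the folder is covered without listing the files one by one.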

Common samples

Here are a few samples to make it clearer.
To exclude all folders from all the bots:

User-agent: *
Disallow: /

To exclude a single folder from all the bots:

User-agent: *
Disallow: /emails/

To exclude all folders from a bot:

User-agent: googlebot
Disallow: /

User-agent: *
Disallow:

To allow just one bot to crawl the site:

User-agent: googlebot
Disallow:

User-agent: *
Disallow: /

To allow all the bots to see all the folders:

User-agent: *
Disallow:

Important tips

By now I believe you've got the basics, but there are a few rules we should know. We can't rely on the wildcard "*" in the Disallow line, because most bots don't understand it there (Googlebot and MSNBot do), so a line like "Disallow: /emails/*.htm" is not valid for most bots. Another rule is that each bot needs its own User-agent line and each excluded directory needs its own Disallow line: "User-agent: googlebot, yahoobot" and "Disallow: /emails, /images" are not valid.
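
If you maintain rules for several bots, generating the file avoids the comma mistake altogether. This is a minimal sketch, not a full generator: `build_robots_txt` is a hypothetical helper that writes one User-agent group per bot and one Disallow line per path.

```python
def build_robots_txt(rules):
    """Render {bot name: [paths to exclude]} as robots.txt text.

    Each bot gets its own User-agent group and each path its own
    Disallow line; an empty list becomes a bare "Disallow:" (allow all).
    """
    groups = []
    for bot, paths in rules.items():
        lines = [f"User-agent: {bot}"]
        lines += [f"Disallow: {p}" for p in paths] or ["Disallow:"]
        groups.append("\n".join(lines))
    # Blank line between groups, newline at the end of the file
    return "\n\n".join(groups) + "\n"

print(build_robots_txt({
    "googlebot": ["/emails/", "/images/"],
    "yahoo-slurp": ["/emails/", "/images/"],
}))
```

The output repeats the two Disallow lines under each bot, which is exactly the long-hand form the rule above asks for.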

Robots can ignore your /robots.txt. Especially malware or spam harvesting robots that scan the web for security vulnerabilities and email addresses will pay no attention.
The /robots.txt file is a publicly available file. Anyone can see what sections of your server you don't want robots to use. So don't try to use /robots.txt to hide information.

Is it possible to allow just one file or folder to be crawled and not the rest? The original robots.txt standard has no Allow line (some engines, Google among them, recognize one as an extension), but you can get the same effect everywhere: put all the files you don't want seen in one folder and disallow it.
For example, "Disallow: /files_that_I_dont_want_to_share/"

Major robots

Major Known Spiders

Googlebot (Google), Googlebot-Image (Google Image Search), MSNBot (MSN), Slurp (Yahoo), Yahoo-Blogs, Mozilla/2.0 (compatible; Ask Jeeves/Teoma), Gigabot (Gigablast), Scrubby (Scrub The Web), Robozilla (DMOZ)

Google

Google allows the use of asterisks. Disallow patterns may include "*" to match any sequence of characters, and patterns may end in "$" to indicate the end of a name. To block all files of a specific type (for example, to keep .jpg images in the index but not .gif images), you'd use the following robots.txt entry:

User-agent: Googlebot-Image
Disallow: /*.gif$
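
To get a feel for how "*" and "$" behave, here is a small Python sketch that turns such a pattern into a regular expression. It is a simplified illustration of the matching described above, not Google's actual implementation, and both function names are my own.

```python
import re

def pattern_to_regex(pattern):
    """Translate a Disallow pattern with "*" and a trailing "$" into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then turn each "*" into "any characters"
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(pattern, path):
    """True if path matches the pattern, starting at its first character."""
    return pattern_to_regex(pattern).match(path) is not None

print(is_blocked("/*.gif$", "/images/logo.gif"))      # True
print(is_blocked("/*.gif$", "/images/logo.jpg"))      # False
print(is_blocked("/*.gif$", "/images/logo.gif?x=1"))  # False
```

The last line shows what the "$" is for: without it the pattern would also match addresses that merely contain ".gif" somewhere in the middle.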

Yahoo

Yahoo also has a few specific commands, including the:

Crawl-delay: xx instruction, where "xx" is the minimum delay in seconds between successive crawler accesses. Yahoo's default crawl-delay value is 1 second. If the crawl rate is a problem for your server, you can raise the delay to 5, 20 or any value your server is comfortable with.

Setting a crawl-delay of 20 seconds for Yahoo-Blogs/v3.9 would look something like:

User-agent: Yahoo-Blogs/v3.9
Crawl-delay: 20
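
If you write your own crawler and want it to honor this instruction, Python's standard `urllib.robotparser` can read the value (the `crawl_delay` method exists from Python 3.6 on). The sketch below uses the plain "Slurp" token and a two-line file purely as an illustration.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: Slurp",
    "Crawl-delay: 20",
])

# Seconds the named bot is asked to wait between two requests
delay = parser.crawl_delay("Slurp")
print(delay)  # 20
```

A polite crawler would then pause for that many seconds between successive requests to the site.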

Ask / Teoma

Supports the crawl-delay command.

MSN Search

Supports the crawl-delay command. Also allows wildcard behavior:

User-agent: msnbot
Disallow: /*.[file extension]$

(the "$" is required, in order to declare the end of the file)

Examples:

User-agent: msnbot
Disallow: /*.PDF$
Disallow: /*.jpeg$
Disallow: /*.exe$

Why do I want a Robots.txt?

There are several reasons you would want to control a robot's visit to your site:

  1. It saves your bandwidth - the spider won't visit areas where there is no useful information (your cgi-bin, images, etc)
  2. It gives you a very basic level of protection - although it's not very good security, it will keep people from easily finding stuff you don't want easily accessible via search engines. They actually have to visit your site and go to the directory instead of finding it on Google, MSN, Yahoo or Teoma.
  3. It cleans up your logs - every time a search engine visits your site it requests the robots.txt, which can happen several times a day. If you don't have one it generates a "404 Not Found" error each time. It's hard to wade through all of these to find genuine errors at the end of the month.
  4. It can prevent spam and penalties associated with duplicate content. Let's say you have a high speed and low speed version of your site, or a landing page intended for use with advertising campaigns. If this content duplicates other content on your site you can find yourself in ill-favor with some search engines. You can use the robots.txt file to prevent the content from being indexed, and therefore avoid issues. Some webmasters also use it to exclude "test" or "development" areas of a website that are not ready for public viewing yet.
  5. It's good programming policy. Pros have a robots.txt. Amateurs don't. What group do you want your site to be in? This is more of an ego/image thing than a "real" reason, but in competitive areas or when applying for a job it can make a difference. Some employers may consider not hiring a webmaster who doesn't know how to use one, on the assumption that they may not know other, more critical things as well. Many feel it's sloppy and unprofessional not to use one.

So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software.

Remember to use all lower case for the filename: "robots.txt", not "Robots.TXT".

Major Search Engine Bots - Spiders Names

Google = googlebot
MSN Search = msnbot
Yahoo = yahoo-slurp
Ask/Teoma = teoma
GigaBlast = gigabot
Scrub The Web = scrubby
DMOZ Checker = robozilla
Nutch = nutch
Alexa/Wayback = ia_archiver
Baidu = baiduspider

Specific Special Bots:

Google Image = googlebot-image
Yahoo MM = yahoo-mmcrawler
MSN PicSearch = psbot
SingingFish = asterias
Yahoo Blogs = yahoo-blogs/v3.9

Main source of information can be found in this forum post by Paskall.
Feel free to ask any question, or correct my mistakes,
Cheers

DMOZ Submission Guide:

As an editor of the DMOZ Directory, I've learned which things matter most for webmasters submitting their work:

  1. Make sure you submit to the most appropriate category. This is very important, as it decides whether your site gets added to the directory or not. If a site is in the correct category and is OK, the editor will probably add it to the directory, but if the category is wrong the site goes to the bottom of another category's review list, which means months of waiting for you.
  2. Keep your description as short as possible. Editors don't like more than 2 lines of description, which means 15-20 words.
  3. Never show "under construction" pages or "404" errors. If any of your pages returns an error or is under construction, simply don't link to it.
  4. Make sure your homepage is on topic. With one quick glance most editors will know whether they want to keep reading and browsing your site, so your homepage must be attractive and link to all your relevant information.
  5. Focus your description on the category topic. If you submit to Web development, you could put "web development company in CA" in the description. Avoid things like "Welcome to Urbano's Blog, your home for the best...", which is completely anti-DMOZ.

In general, use common sense when submitting, and don't forget to note down the editor's name, the category name and the submission date in case you need to contact DMOZ in the future, for example to ask how your site's review is going.
If after a long time (more than 5 months) you have heard nothing about your site's inclusion, you should consider contacting the editor or resubmitting, because there is a good chance that your site was moved to the bottom of the list of unreviewed sites.

Remove/Add the www? | SEO Tips:

I'm sure you have all seen that some sites use www and some don't. What about yours? To most web developers this might not seem important, but it is essential, because search engines may treat the two addresses as different sites. If you haven't chosen one, you may end up with, for example, www.yoursite.com at PR 3 and yoursite.com at PR 1.

To avoid this you must first decide which you prefer, with or without. I personally like it without...

Google tools:

A very important one, go to Google Webmaster Tools, claim your site, and there you have the option to display the www. in searches or not.

Using .htaccess

Another way is by using mod_rewrite in .htaccess.
To use this option, create or edit your existing .htaccess file and add the following lines:
Adding www:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

Remember to replace domain.com with your domain!
Removing www:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com/$1 [R=301,L]

More information about mod_rewrite and code snippets

Using php

Add the following script to the very top of your page, before anything else!
Adding www:

 
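<?php
// Redirect to the www host when the request came in without it
if (substr($_SERVER['HTTP_HOST'], 0, 4) !== 'www.') {
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: http://www.urbanoalvarez.es' . $_SERVER['REQUEST_URI']);
    exit; // stop here so the rest of the page is not sent
}
?>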
 

Removing www:

 
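<?php
// Redirect to the bare domain when the request came in with www
if (substr($_SERVER['HTTP_HOST'], 0, 4) === 'www.') {
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: http://urbanoalvarez.es' . $_SERVER['REQUEST_URI']);
    exit; // stop here so the rest of the page is not sent
}
?>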
 

Remember to replace "urbanoalvarez.es" with your domain name.
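
Whichever method you use, the decision behind the redirect is the same, and it can be sketched in a few lines of Python to see what should happen for a given address. `canonical_redirect` is a hypothetical helper written for this post; it only computes the target of the 301 and does not send anything.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_redirect(url, use_www):
    """Return the URL to 301-redirect to, or None if url is already canonical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    has_www = host.startswith("www.")
    if has_www == use_www:
        return None  # already in the preferred form, no redirect needed
    new_host = "www." + host if use_www else host[len("www."):]
    # Keep path, query and fragment so deep links survive the redirect
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, parts.fragment))

print(canonical_redirect("http://www.urbanoalvarez.es/blog?page=2", use_www=False))
# prints http://urbanoalvarez.es/blog?page=2
print(canonical_redirect("http://urbanoalvarez.es/blog?page=2", use_www=False))
# prints None
```

The important detail is that the path and query string are carried over, the same job `$1` and `REQUEST_URI` do in the snippets above.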

Other ways

If you don't have PHP or the mod_rewrite module installed, you could do a simple 301 redirection.

If you don't understand it, or you find a better way of achieving this, please comment :D
