
Removing wp_head() elements (rel=’start’, etc.)...

In customising WordPress you may occasionally need to remove or add to the link elements that WordPress automatically outputs when wp_head() is called. I recently needed to remove the rel='prev' and rel='next' link elements and, to avoid modifying the core WordPress functions, the following solution works.

Ensure you have a functions.php file in the directory of the theme you are using. If not, create the file. The following lines will remove selected elements from the output of wp_head():

remove_action( 'wp_head', 'feed_links_extra', 3 ); // Removes the links to the extra feeds such as category feeds
remove_action( 'wp_head', 'feed_links', 2 ); // Removes links to the general feeds: Post and Comment Feed
remove_action( 'wp_head', 'rsd_link' ); // Removes the link to the Really Simple Discovery service endpoint, EditURI link
remove_action( 'wp_head', 'wlwmanifest_link' ); // Removes the link to the Windows Live Writer manifest file
remove_action( 'wp_head', 'index_rel_link' ); // Removes the index link
remove_action( 'wp_head', 'parent_post_rel_link' ); // Removes the parent post link
remove_action( 'wp_head', 'start_post_rel_link' ); // Removes the start link
remove_action( 'wp_head', 'adjacent_posts_rel_link' ); // Removes the relational links for the posts adjacent to the current post
remove_action( 'wp_head', 'wp_generator' ); // Removes the WordPress version, i.e. WordPress 2.8.4

Don’t remove these items unless you have a need to. The WordPress generator removal could be useful if you are not religiously upgrading your WordPress install, as it helps hide the WP version from potential hackers to a certain...

HTTP Response Header Checker

The HTTP response is the information returned over the HTTP protocol when you access URLs on the Internet. Google, Yahoo and in fact all browsers rely on this information to determine whether the resource you are trying to access has been found or, if not, what may have happened to it.

The full HTTP response contains a variety of information that a web server sends in reply to an HTTP request. It can reveal interesting details such as the web server a site is hosted upon, the scripting language used and, most importantly, the response code. The following search box allows you to enter a URL and see the full HTTP response.

Why is this useful, you may be thinking? Well, Google and the other search engines rely on the response codes to determine whether they index your site. For a resource to be indexed you will more often than not be looking for a ‘200 OK’ response. If a page is missing you may get a ‘404 Not Found’. If a page has gone for good you may want a ‘410 Gone’ response to be sent back.

Feel free to use this tool I have developed to test your URL’s HTTP response...
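To make the parts of a response concrete, here is a minimal Python sketch that splits a raw response head into its status code, reason phrase and headers, then maps the three codes mentioned above to what they mean for indexing. The function names and the sample response (including its Server and X-Powered-By values) are made up for illustration; this is not how the checker tool on this page is implemented.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (status_code, reason, headers)."""
    lines = raw.strip().splitlines()
    # The status line looks like: HTTP/1.1 200 OK
    _version, code, *reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line marks the end of the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), (reason[0] if reason else ""), headers


def describe_status(code):
    """Rough meaning of the response codes discussed in the text."""
    meanings = {
        200: "found, so the page can be indexed",
        404: "missing",
        410: "gone for good",
    }
    return meanings.get(code, "other")


# Hypothetical sample response, invented for this example.
SAMPLE = (
    "HTTP/1.1 404 Not Found\r\n"
    "Server: Apache\r\n"
    "X-Powered-By: PHP/5.2.6\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

code, reason, headers = parse_response_head(SAMPLE)
print(code, reason)            # 404 Not Found
print(describe_status(code))   # missing
print(headers["server"])       # Apache
```

The Server and X-Powered-By headers are where the web server and scripting language typically show up, when the host chooses to send them.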

The long forgotten robots.txt

I am still amazed at how many web sites don’t employ a robots.txt file at the root of their web server. Even SEO firms or people claiming to be SEO experts have them missing, which I find very funny. There are also countless arguments about whether you still need a robots.txt, but my advice is that if the search engine robots still request it then I’d rather have it there with the welcome mat to the site.

For those of you who don’t know the history of the robots.txt file, I’d suggest you have a Google or Wikipedia for it. In short, it’s a text file that specifies which parts of a web site to ‘index’ and ‘crawl’ and which parts not to. You can also get specific and set up rules for particular spiders and crawlers.

To start with, you need to create a text file called robots.txt and place it in the root of your web host. You should be able to access it through your web browser at www.yourdomain.com/robots.txt. You can view other web sites’ robots.txt files by accessing the robots.txt at the root of their domain.

If you want Google, etc. to come into your site and index everything then things are very easy. Simply add the following to your robots.txt file and away you go:

User-agent: *
Disallow:

Alternatively, if you wish to stop all pages in your site being indexed then the following should be present in your file:

User-agent: *
Disallow: /

To stop robots indexing a folder called images and another called private you would add a Disallow line for each folder:

User-agent: *
Disallow: /images/
Disallow: /private/

The above would still index the rest of the site, but anything in those folders would be excluded from search engine results.

To disallow a file you specify the file in the same way as a folder above:

User-agent: *
Disallow: /myPrivateFile.htm

If you only wanted Google to have access to your site you would specify the following (Googlebot is the name Google’s crawler identifies itself with):

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

If you are looking at getting your site fully indexed then I would put the first example in your robots.txt...
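If you want to check how a set of rules will be read before uploading them, Python’s standard urllib.robotparser module can evaluate them locally. The sketch below feeds it the folder example and the Google-only example from above; the helper name build_parser, the crawler name SomeOtherBot and the URLs are placeholders for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def build_parser(rules):
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


# Block two folders for every robot, allow the rest of the site.
FOLDER_RULES = """\
User-agent: *
Disallow: /images/
Disallow: /private/
"""

# Allow Google's crawler everywhere, block everyone else.
GOOGLE_ONLY_RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

folders = build_parser(FOLDER_RULES)
print(folders.can_fetch("*", "http://www.yourdomain.com/images/logo.png"))  # False
print(folders.can_fetch("*", "http://www.yourdomain.com/index.html"))       # True

google_only = build_parser(GOOGLE_ONLY_RULES)
print(google_only.can_fetch("Googlebot", "http://www.yourdomain.com/page.html"))     # True
print(google_only.can_fetch("SomeOtherBot", "http://www.yourdomain.com/page.html"))  # False
```

Note that an empty Disallow line means nothing is blocked, which is why Googlebot is allowed everywhere in the second set of rules.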

PageRank no longer important?

I keep hearing people say that Google’s PageRank is no longer important and that X or Y is much more important. I still believe that PageRank is one of the most important factors in a site ranking well in the Google results.

Firstly, Google still shows a PageRank value on the Google toolbar. OK, this is not updated immediately, but I would assume that is down to infrastructure and speed issues.

Secondly, Google went to all the trouble of patenting the PageRank algorithm through Stanford University and I doubt they are going to just leave this behind. PageRank has been the foundation of Google’s success and, although it is undoubtedly tweaked and used alongside other algorithms, it is still in my opinion the basis of the initial search engine rankings.

Thirdly, when you hear Google employee Matt Cutts discussing PageRank in such detail you know it is still fundamental to how Google works: http://www.mattcutts.com/blog/more-info-on-pagerank/ . As Matt discusses in this article, the PageRank shown in the toolbar is usually late to reflect how your site is ranking. This, however, is just the toolbar representation of it. Behind the scenes PageRank is changing all the time as the index is added to, pages are removed and back links are calculated.

So there you have it: PageRank is still alive and soldiering on. Whilst Mr Cutts still talks about PageRank we know it is still one of the main factors in Google search engine...

‘Black-Hat’ SEO Techniques

The following SEO techniques are known as ‘black-hat’ SEO and should be avoided on your web site. These techniques can sometimes provide very quick improvements in search engine results, but over time they are known to cause sites to be banned and removed from indexes altogether.

Hidden Text

An old technique to increase keywords on a page was to include long lists of keywords and key phrases as hidden text. This was sometimes achieved by placing the text far below the main content on the page or by displaying the text in the same colour as the background, i.e. white on white or black on black. This technique goes against Google’s webmaster guidelines.

Cloaking or Doorway Pages

Don’t deceive your users or present different content to search engines than you display to users. Matt Cutts (Google employee) describes a classic case of this on his blog (http://www.mattcutts.com/blog/ramping-up-on-international-webspam/) whereby BMW displayed text to search engine robots whereas normal web users would be shown other content via the use of a JavaScript redirect. It is important not to use quick redirects on pages so that the search engine crawlers see certain content and a user’s browser is then redirected to another page.

Link Schemes

Avoid link schemes that provide a massive increase in incoming links from bad neighbourhoods and other sites of dubious content. Participating in such schemes can result in your site being penalised.

Automated Search Engine Submission Software

Avoid using search engine submission software such as ‘WebPosition Gold’ or other similar products. As long as your site is linked to from other sites and is up and running with a valid robots.txt file, its pages and content will be indexed without the need for this software.

Duplicate Content

Avoid duplicating the same content on different pages. If Google detects large amounts of duplicated pages on different sub or main domains then you risk a ‘duplicate content penalty’, which can result in the site losing rankings. Such sub domains are all treated as separate websites, and if duplicate content is found then both sites can suffer ranking problems until the original source of the content is determined by Google’s algorithms.

Content Solely for Search Engines

Avoid publishing any content that is solely for search engine spiders. Content that is too dense with keywords and unintelligible to a human can be detected by Google and other search engines and result in site penalties. Always write content firstly for humans and secondly for crawlers and robots. As long as the pages are...

SEO Part 14 – Web Based Sitemap

A sitemap enables search engines to easily find all content on the site by following all the links from the sitemap page(s). With the implementation of Google sitemaps, discussed in the ‘Google webmaster tools’ section, there is no longer a huge need from a Google SEO point of view for a web based sitemap. For other search engines, however, it would be beneficial to have a few pages on the site that link to all content currently published. A single page can put a heavy load on the system for large sites, and for the front end user it can be too big to download and view, so several pages of links could make up such a sitemap. The sitemap start page should be linked to from the homepage as this is the most common page that search engine spiders will...
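Splitting the links across several sitemap pages can be sketched in a few lines of Python. The function name paginate_links, the page size and the example URLs are assumptions made for illustration; how you generate the list of published URLs depends on your own site.

```python
def paginate_links(urls, per_page):
    """Split a list of URLs into sitemap pages of at most per_page links."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return [urls[i:i + per_page] for i in range(0, len(urls), per_page)]


# Hypothetical list of published pages.
all_urls = ["/page-%d.htm" % n for n in range(1, 6)]

pages = paginate_links(all_urls, per_page=2)
for number, links in enumerate(pages, start=1):
    print("Sitemap page", number, links)
```

Each resulting page of links can then be rendered as its own sitemap page, with the first one acting as the sitemap start page.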