Archive for the 'Number Crunching' Category

Apr 04 2008

“Lack of Stats” -pr0n for the Last Week

There's been a rash of stats-porn, and anti-stats April Fool porn, on UK political blogs this week. I'm not interested in all of that, but the comments about blogs versus big media are interesting, especially with "Politics Portals" (my term), which have their origins in the blogosphere, on the way.

Since this is Free for all Friday again I thought I would make my contrarian contribution.

Comments Off

Mar 22 2008

Development of UK Political Presence in Open Directory

Published by admin under Irish Comment, Number Crunching

Introduction

[This article is a draft version of a short research paper, so comments are more than welcome.]

The Open Directory (wikipedia article) is a human-edited directory of categorised websites, which has long been treated as an authoritative directory. Traditionally, it is a place to have your website listed to establish favourable rankings in web searches. The site has lost some of its reputation over the last few years - due to alleged partiality by editors - but it is still one way of looking at the internet presence of UK political parties over time.

In this article I’m going to look at how the presence of UK party political websites has developed on the Open Directory since 2000. I’ve looked at the UK Political Parties category, taking the first Internet Archive snapshot of each year (2000, 2001, 2002, 2003, 2004, 2006 and 2007; no data is available for 2005).

If you are interested in a “softer” angle, and a look at how listings in sites such as the Open Directory and Wikipedia help get the attention of the internet public, then you should look at my Blog Platform column tomorrow (Sunday March 23rd) over on the Wardman Wire.

Overall Totals of UK Party Political Websites

[Graph: Overall Totals]

Note the absence of data in the Internet Archive for 2005. Still - pretty much the trend we would expect as the use of the web has become mainstream.

Total Number of Websites by Party

This graph shows the trends up until the middle of 2007. The totals for March 2008 are: Labour 493, Tory 594, Lib Dem 695.

My reflections:

How did a much smaller party (Lib Dems) get almost half as many again party affiliated websites listed as the larger Labour Party? There are two possibilities - they either have more websites getting “natural” listings or they are being publicised more effectively or more strategically. I think it is the latter, but we cannot tell whether there is a centrally driven “directory listing” campaign, or a highly aware set of webmasters (perhaps encouraged by workshops at Conference etc).

The Lib Dems (and to a lesser extent the Tories) clearly had some sort of website creation, or website listing, “internet push” around 2005 - the time of the last election. The trend continuing into 2008 indicates that the Tories seem to have “relaxed” while the Lib Dems have “kept on trucking”.

To my mind, the Lib Dems are quite effective at targeted Internet campaigning - both in raising awareness of particular issues and in taking a long-term, strategic approach. To explain why I’d point to four factors: firstly, it is easier to mobilise, and to maintain that mobilisation, in a smaller body; secondly, a smaller group has fewer resources - and internet promotion is more or less free; thirdly, the Lib Dems have a greater tradition of prioritising the local - and all this can be done by local activists; finally, the Lib Dems seem (to me) to have a tighter co-ordination between the central Organisation and the web-activists - and a stability in the way the two have worked together over 5 to 10 years.

Whatever the reason, these numbers are quite impressive, and the Tories and Labour have some catching up to do.

(more…)
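As a quick check on the “almost half as many again” comparison, the March 2008 totals quoted in this article can be expressed as multiples of the Labour figure. This is only an illustrative sketch: the function name is my own invention and nothing here is taken from the Open Directory itself.

```python
# March 2008 Open Directory totals as quoted in the article.
listings = {"Labour": 493, "Tory": 594, "Lib Dem": 695}


def ratio_to(base, totals):
    """Return each party's listing count as a multiple of the base party's count."""
    return {party: count / totals[base] for party, count in totals.items()}


ratios = ratio_to("Labour", listings)
for party, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{party}: {ratio:.2f}x Labour")
```

On these figures the Lib Dems have roughly 41% more listings than Labour, and the Tories roughly 20% more.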


Mar 19 2008

The Advantages of Tabloid Blogging

Last week Mr Eugenides (known affectionately on this site as “Hamish the Greek” - he is one person who can safely be treated with political incorrectness without running the risk of a tantrum) was the second most popular political blogger in Scotland (after Richard Leyton). This week he has blown Leyton into the weeds. Here are the stats from Blogtopsites for Mr E. You can ignore the spike on the left - that is an artefact of the start of the graph. The traffic is going mad this week. I wonder if the pictures of six girls in bikinis or tight tops, the two of Alistair Darling behind bars, and the four I Can Has Cheezburger knockoffs featuring Tony Blair have anything to do with it. Hopefully, it was the demolition of Gary Pugh’s idea to DNA test children. Shame on you, Mr E - probably.

Tags: mr eugenides, tabloid blogging, alistair darling


Feb 27 2008

Analysis of Traffic Levels and Most popular articles on the Wardman Wire

Now that I have the links between my different websites in place, I have been looking at the amount of traffic being generated over the last month, and the most popular articles.

Total Raw Traffic

On this occasion I’ve processed all the raw log files using the free version of a utility called Deep Log Analyser, rather than relying on the data generated by a Wordpress Plugin. However, one caveat is that at least 4 of the sites (those which aggregate Parliamentary blogs - www.senedd.me.uk, www.holyrood.me.uk, www.europarl.me.uk and www.parliament.me.uk) are all less than a week old. So I have had to include some judicious estimates in the figures. Another difference is that certain files that are not part of the Wordpress installation are included in the numbers.

The raw total of page impressions is 376,000 across all the 13 sites (the twelve in the toolbar and www.mattwardman.co.uk). More than half of these relate to www.mattwardman.com.

The Impact of Files that shouldn’t count

These are the top 5 files listed for www.mattwardman.com and what they are:

Page Views - Filename - What is it?

14,880 Page Views - polls-js.php - This is used for in-page polls which are refreshed without reloading the page. This should be excluded.

14,452 Page Views - podpress_js.php - Part of the Podpress Wordpress plugin. Not actually used on this blog. I should really find a way to exclude this.

13,872 Page Views - /blog/feed/index.php - This is the home page for the RSS feed. This could be included or excluded depending on which statistics I am interested in.

11,747 Page Views - clickmanager.cgi - This is the redirector programme “bounced off” when I need to count clicks on a link. This indicates 11,747 clicks on links in 30 days. I use it, for example, to count the clicks on the toolbar (hover over a button and see the “double” web address), and the clicks on stories in the Daily Roundup. This should be excluded.

10,710 Page Views - /blog/index.php - At last, one that counts. This is the Blog home page. It accounts for very few impressions out of the total. I will return to this - it is a sign of how important blog archives are for attracting traffic.

So - just excluding 4 of these top 5 reduces the traffic to www.mattwardman.com by roughly 55,000 page impressions compared with the raw log files. Counts from inside Wordpress are cleaner, but still have a lot of “gunge” in the data.

So what is a Reasonable Total?

I am happy to quote a total number of page impressions for this 30 day period of “around 250,000” - a reduction of a third. But having done that - 250,000 page impressions in a month on a set of sites that are mainly only 8 months old is OK. The figure for the main www.mattwardman.com site is around 130,000-140,000 page impressions (with approximately another 30,000 or so for www.mattwardman.co.uk). These figures themselves roughly tally with the numbers given by the Slimstat-EX Wordpress plugin (140,000 and 35,000 respectively - also probably containing some search crawling).

The Real Top Twelve Pages on www.mattwardman.com

After filtering out the noise, the following are the Top Twelve pages on the site in the last 30 days. Go and have a look at the links, and write down your conclusions - then read my notes below.
Page Views - Date - Title and Link

3942 Page Views - 20070912 - In memory of free speech - Jesus and Mo - serious
3825 Page Views - 20070501 - Double Trouble - Posh Spice and Ananova - humour, morning funny
2130 Page Views - 20071010 - Wordpress Plugin - Category Images - tech tip
1616 Page Views - 20080213 - ABC Rowan Firestorm was started by the BBC - serious analysis
1422 Page Views - 20070904 - New Scottish Government launches official website - satirical
1280 Page Views - 20070905 - This posting may contain nuts - serious but funny - health and safety series
998 Page Views - 20070404 - Double Trouble - Guido Fawkes and Zorro - humour, morning funny
995 Page Views - 20070411 - Double Trouble - Morgan Lifecar and Thomas the Tank Engines - humour, morning funny
849 Page Views - 20071016 - Lib Dem leadership contest to replace Ming Campbell - humour, Lib Dems = box of ferrets
817 Page Views - 20070609 - Video Game Battle between Sony and Church of England - serious analysis
812 Page Views - 20080211 - Britblog Roundup - Ideas for Avoiding the Archbishop - serious
709 Page Views - 20070815 - Do Health and Safety Professionals Get too Much of a Kicking - serious - health and safety series

(more…)
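The filtering step described earlier in this post (dropping helper files from the raw page-view counts for www.mattwardman.com) can be sketched roughly as follows. The counts are the ones quoted in the post; the function name and the choice to treat the RSS feed as noise are assumptions for illustration, and this is not how Deep Log Analyser itself works.

```python
# Raw page-view counts for the top five files on www.mattwardman.com,
# as quoted in the post.
raw_counts = {
    "polls-js.php": 14880,
    "podpress_js.php": 14452,
    "/blog/feed/index.php": 13872,
    "clickmanager.cgi": 11747,
    "/blog/index.php": 10710,
}

# Assumption: these four are treated as noise. The RSS feed entry is a
# judgment call and could equally be kept.
EXCLUDED = {
    "polls-js.php",
    "podpress_js.php",
    "/blog/feed/index.php",
    "clickmanager.cgi",
}


def split_noise(counts, excluded):
    """Return (kept_counts, removed_total) after dropping excluded filenames."""
    kept = {name: n for name, n in counts.items() if name not in excluded}
    removed_total = sum(n for name, n in counts.items() if name in excluded)
    return kept, removed_total


kept, removed = split_noise(raw_counts, EXCLUDED)
print("Kept:", kept)
print("Page views removed:", removed)
```

Keeping the exclusion list as a separate set makes it easy to rerun the numbers with the RSS feed counted back in.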


Feb 21 2008

How many Local Councillors are there?

Iain Dale wants to know:

How many different forms of councillor there are - ie how many district councillors, how many county councillors etc. I rang the LGA who told me to ring the Office of National Statistics. They had no idea. I’ve tried googling it but no luck. Can anyone help?

[Update 12pm. David Boothroyd supplied the gen:

As of now at this moment in time:

London borough councillors: 1,861
English county councillors: 2,270
Metropolitan borough councillors: 2,555
English unitary authority councillors: 2,407
English lower-tier district councillors: 10,575
Welsh unitary authority councillors: 1,264
Scottish unitary authority councillors: 1,222
Northern Ireland district councillors: 582
Grand total: 22,736

Parish councillors are not counted as principal local authorities but according to the National Association of Local Councils there are nearly 100,000 Parish, town and community councillors in England and Wales.]

Leaving aside that he’s asked 2 different questions:

How many types of Councillor. It is not clear whether Town and Parish Councillors are of interest.
How many of each type.

… five minutes digging reveals the totals not the types.

For England and Wales for 2006:

Over 20,000 elected councillors represent local communities and local people on the 410 local authorities of England and Wales. Employing over two million people, these local councils undertake an estimated 700 different functions.

Of which, concerning Wales:

There are over 1200 councillors serving on Wales’ 22 all purpose local authorities - these are responsible for £4 billion of public expenditure which is over one third of the total Welsh budget.

There should be a PDF brochure here, but it is on a government-run site so the link is broken.

Scotland has 32 councils and 1222 Councillors:

The Directory provides a unique one-step source for: All 1222 Councillors

You can buy a directory for about £120.

Northern Ireland has 26 local councils and 582 Councillors.
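The breakdown David Boothroyd supplied can be cross-checked against the other totals quoted here with a few lines of arithmetic. A minimal sketch, using only the figures in the update; the nation labels and the function name are my own grouping for illustration.

```python
# (nation, councillor type, count) exactly as given in the update.
rows = [
    ("England", "London borough", 1861),
    ("England", "English county", 2270),
    ("England", "Metropolitan borough", 2555),
    ("England", "English unitary authority", 2407),
    ("England", "English lower-tier district", 10575),
    ("Wales", "Welsh unitary authority", 1264),
    ("Scotland", "Scottish unitary authority", 1222),
    ("Northern Ireland", "Northern Ireland district", 582),
]


def totals_by_nation(rows):
    """Sum councillor counts per nation."""
    totals = {}
    for nation, _kind, count in rows:
        totals[nation] = totals.get(nation, 0) + count
    return totals


by_nation = totals_by_nation(rows)
grand_total = sum(by_nation.values())
england_and_wales = by_nation["England"] + by_nation["Wales"]

print(by_nation)
print("Grand total:", grand_total)
print("England and Wales:", england_and_wales)
```

The sum matches the quoted grand total of 22,736, and the England and Wales subtotal of 20,932 is consistent with the “over 20,000” figure.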
That’s the best I can do for now, but for exact numbers I’d be looking at the Boundary Commission for England and their equivalents, not the Local Government Association. Or some commercial databases. I’ll revisit this later today if I have time.

Useful Links

The Northern Ireland LGA website has a useful breakdown with councillors by council. The BBC Action network has a useful background (not stats) page about Councillors, and one about Parish/Town Councillors (basically they cover street furniture, parks, cemeteries and other “local environment” things).

Tags: statistics, number crunching, local councillors england, local councillors ireland, local councillors scotland, local councillors wales


Jan 30 2008

Good Blogging Ideas: Focus on RSS Subscribers - they will come back

RSS subscribers represent visitors who come back to your site repeatedly. I took a look at my statistics last night, and the Wardman Wire main site is up to about 130 subscribers. Here are a couple of diagrams from Feedburner relating to the Wardman Wire.

RSS subscribers tend to build over extended periods of time - the Devil, for example, who has been blogging for 3 years or so, currently has just under 700 subscribers. This is the gradual buildup:

Note that I didn’t implement Feedburner until June 2007.

(more…)


Jan 29 2008

Milestone: 20,000 unique visitors in one month on the Wardman Wire

The Wardman Wire reached an important milestone this afternoon: we just went past 20,000 measured uniques in a month for the first time. On this occasion, I’m not apologising for posting statistics - since it’s taken 9 months and a lot of work to get to this point.

Figures for the Wardman Wire

These are a couple of screenshots of the display from the “Slimstat-EX” plugin. Firstly, the summary. You can click through for a fuller screenshot illustrating how our traffic profile is rather different from (less purely “politico” than) most political blogs in the UK.

“Visits” in this screenshot means “Unique visitors” (defined as the number of different internet addresses from which people visit during each individual day, summed across the month). “Hits” means page impressions.

And a sorted version showing monthly figures since we started:

These figures are not filtered for *all* search engines (it would slow the blog down dramatically), which is why I emphasize the “uniques” not the “hits”. The hits figures are likely to be high by perhaps 10-20%. The uniques figures will also be slightly high, but much less so than the hits.

There are two major distortions in these figures. The July 2007 figures went haywire because I posted an 18th birthday interview with Daniel Radcliffe (Harry Potter) at a cricket match. It is now up to 287 comments, which is ludicrous - including one from the man himself. And the “Hits” (page impressions) figure for December is inflated by perhaps 25,000 over and above the 10-20% I mention above, since I left the “search engine pinger” turned on by mistake while posting 100 or so cartoons to appear on the blog between New Year and the middle of April 2008.

So the real “hits” figures for December 2007 and January 2008 are likely to be around 100,000 to 110,000 in my estimation.
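To make the definition of “uniques” above concrete, here is a minimal sketch of counting distinct internet addresses per day and summing them across a month. It only illustrates the definition given in this post; how the Slimstat-EX plugin actually computes its figures is not shown here, and the function name and sample data are invented for the example.

```python
from collections import defaultdict


def monthly_uniques(hits):
    """Count distinct addresses per day, then sum the daily counts.

    hits: iterable of (date_string, ip_address), one entry per page impression.
    """
    ips_by_day = defaultdict(set)
    for day, ip in hits:
        ips_by_day[day].add(ip)
    return sum(len(ips) for ips in ips_by_day.values())


# Invented sample data using reserved documentation addresses.
sample = [
    ("2008-01-28", "192.0.2.1"),
    ("2008-01-28", "192.0.2.1"),  # same address, same day: counted once
    ("2008-01-28", "192.0.2.7"),
    ("2008-01-29", "192.0.2.1"),  # same address, new day: counted again
]

uniques = monthly_uniques(sample)
hits_total = len(sample)
print("Uniques:", uniques, "Hits:", hits_total)
```

Under this definition a regular reader who visits on several days is counted once for each of those days, so the monthly figure is not the same thing as the number of distinct people.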
And the UK Edition

The real figure for the uniques for the main site in January is likely to be 18,000 or 19,000, but fortunately there are another 4,300 or so who visited the UK Edition (again - click through for more detail) so it still comes in at rather more than 20,000.

Before anyone asks, I have not got the foggiest idea why a search for “development For sweater opportunity building” should land on my site, unless it was a visit from Gyles Brandreth.

Wrapping Up

OK - enough statistical self-abuse. Back to politics. Did you realise that Mr Darling’s Capital Gains Tax reforms have abolished the indexation allowance for CGT (so you will be taxed on the increase in cash - not real - value of an asset, including if the value has gone down in real terms), and that in fact - like the last budget - they hit the poorer members of society hardest? More on that later when today’s Working Lunch is available online.

Except for the most important thing: a really big thank-you to everyone who has visited, and especially those who have taken the trouble to link to the blog or participate in the debate here. Your presence is very much appreciated, especially if you disagree with what has been written and help generate a wider debate. A wider debate is a worthwhile reason for putting hundreds of hours into building a blog.

Tags: number crunching, new year 2008, 20000 uniques, record month, political blogging, wardman wire, matt wardman


Jan 25 2008

Daily Roundup: Which stories are people interested in.

[Update: the urls have overstretched the template, so I’ll edit the article over the w/e to make it more readable - my apologies. Matt].

I have a script installed on the Wardman Wire which allows me to count which links are being used to leave the site. It doesn’t get used on everything, but when I need to monitor how popular links are, it allows me to do so easily. A good example where I use the script is in keeping track of how many people are clicking on “sponsor” adverts. These counts were reset at New Year. The “76” is the average (mean) since I installed the script several months ago - the figure is much lower than the current figures as I didn’t start using the facility widely until January.

How does it work?

It works in the usual way for such scripts - by doing a “bounce” via a location that records the click in a file; the counts can then be viewed in a web browser. The one I use is called Click Manager (this link is redirected so I will know how many of you go to look), and has been around for a few years. Up to the time of writing this article, there have been 319 “out-clicks” today (OK - yesterday - when this article was written), mainly from the Daily Roundup. The counts get reset every month.

So what is Popular in the Roundup on Thursday 24th January?

I’m not going to look at this in detail article by article, but these were the statistics from today’s roundup. This is a straight (rather crude) dump of the links. They are not in the same order as the article, because the Link Index Number is assigned when the link is first clicked. 437 links which bounce through the script have been clicked since New Year. These are the numbers from today’s roundup:

The three fields are: Link Index Number, Link itself, Number of Clicks.
422 http://commentisfree.guardian.co.uk/larry_elliott/2008/01/the_fed_moves_but_is_it_too_la.html 21
423 http://commentisfree.guardian.co.uk/dan_kennedy/2008/01/googling_the_new_york_times.html 21
424 http://news.bbc.co.uk/1/hi/uk_politics/7206137.stm 20
425 http://www.guardian.co.uk/business/2008/jan/24/housingmarket.jsainsbury 20
426 http://www.dailymail.co.uk/pages/live/articles/news/news.html?in_article_id=510054 16
427 http://www.nytimes.com/2008/01/24/sports/othersports/24mask.html?_r=1& 16
428 http://www.timesonline.co.uk/tol/news/world/middle_east/article3238615.ece 22
429 http://www.timesonline.co.uk/tol/news/uk/health/article3238697.ece 21
430 http://business.timesonline.co.uk/tol/business/money/tax/article3241475.ece 22
431 http://www.economist.com/obituary/displaystory.cfm?story_id=10530041 20
432 http://news.bbc.co.uk/1/hi/scotland/7205623.stm 21
433 http://www.bbc.co.uk/mediaselector/check/player/nol/newsid_7170000/newsid_7171300?redirect=7171353.stm 16
434 http://news.bbc.co.uk/1/hi/england/bradford/7204543.stm 21
435 http://www.bbc.co.uk/mediaselector/check/player/nol/newsid_7170000/newsid_7171300?redirect=7171353.stm&news=1&nbwm=1&nbram=1&bbwm=1&bbram=1 3
436 http://www.nytimes.com/2008/01/24/sports/othersports/24mask.html?_r=1&hp&oref=slogin 4
437 http://www.dailymail.co.uk/pages/live/articles/news/news.html?in_article_id=510054&in_page_id=1770 5

And the Conclusions?

The one conclusion that jumps out is the clear unpopularity of the last three stories in the list. People were not interested in:

435: The BBC report about blogging in Wales. It is an older link, so perhaps you have all seen it.
436: Olympic teams and the Smog in Beijing.
437: George Soros’ opinion of our prospects for recession.

And what about Wednesday 23rd January?

Yesterday there was a contrast. Here is the list of links:

The three fields are: Link Index Number, Link itself, Number of Clicks.
400 http://commentisfree.guardian.co.uk/richard_adams/2008/01/slasher_flick.html 17
401 http://news.independent.co.uk/uk/legal/article3362252.ece 36
402 http://blogs.telegraph.co.uk/politics/brassneck/jan08/red_blue_swing_boroughs_of_london.htm 17
403 http://www.timesonline.co.uk/tol/comment/columnists/magnus_linklater/article3234441.ece 17
404 http://www.timesonline.co.uk/tol/life_and_style/food_and_drink/article3204370.ece 19
405 http://news.bbc.co.uk/1/hi/uk_politics/7203740.stm 18
406 http://news.bbc.co.uk/1/hi/uk_politics/7203421.stm 18
407 http://www.ft.com/cms/s/0/b5e4c196-c938-11dc-9807-000077b07658.html?nclick_check=1 18
408 http://www.guardian.co.uk/international/story/0,,2245036,00.html 2
409 http://education.guardian.co.uk/higher/news/story/0,,2245216,00.html 2
410 http://www.guardian.co.uk/business/2008/jan/23/marketturmoil.interestrates2 9
411 http://news.independent.co.uk/business/news/article3359122.ece 9
412 http://www.dailymail.co.uk/pages/live/articles/news/news.html?in_article_id=509693&in_page_id=1770 2
413 http://news.sky.com/skynews/picture_gallery/0,,91221-1301614,00.html 2
414 http://business.timesonline.co.uk/tol/business/economics/article3229659.ece 9
415 http://news.bbc.co.uk/1/hi/scotland/7202105.stm 9
416 http://news.bbc.co.uk/1/hi/wales/7202364.stm 9
417 http://news.bbc.co.uk/1/hi/wales/7202600.stm 9
418 http://www.guardian.co.uk/international/story/0 8
419 http://education.guardian.co.uk/higher/news/story/0 7
420 http://www.dailymail.co.uk/pages/live/articles/news/news.html?in_article_id=509693 7
421 http://news.sky.com/skynews/picture_gallery/0 7

And the Conclusions?

Two facts stand out:

Firstly, one story - number 401 - was twice as popular as anything else. This was an obscure legal story buried in the Independent about the current negotiations for fees for lawyers. Frankly, I don’t understand why that was the popular one (except that bloggers are either lawyers, jealous of lawyers, pitiers of lawyers, or money-grubbing nerds).
Or perhaps M’Learned Friends from Mr Usmanov’s lawyers Schillings were reading.

Secondly, these figures divide into two plateaus:

400-407 inclusive - 8 stories - had a total of 160 clickthroughs. 20 each.
408-421 inclusive - 14 stories - had a total of 91 clickthroughs. About 6.5 each.

That is a dramatic difference, and it is due to my breaking of the article into an excerpt and a continuation. The excerpt appears on the front page, and readers are required to click to read the rest of the article. That implies that a roundup should comprise a slightly smaller number of news articles (which saves time anyway), and that (at least for this type of article) it should not be split.

Finally, a number of articles were not popular at all:

408: Congo conflict causes 45,000 deaths a month: study
409: Top universities fail to spend £3m set aside to attract poorer students
412: Amazing photos from Nasa probe reveal mystery figure on Red Planet
413: Consultancy Deloitte has predicted the big technology talking points for 2008.

I cannot see a pattern here. Any suggestions?

Wrapping Up

These are my statistics - do you have any comments or comparisons?

Tags: aardwark click-counter, count clicks, measurement, politics, money grubbing, alisher usmanov, schillings
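For anyone curious about the “bounce” mechanism described in this post, here is a rough sketch of the idea: record the click, assign a Link Index Number the first time a link is seen, then redirect the reader. This is not Click Manager’s code, which I have not reproduced; the class and method names are hypothetical, the example addresses are placeholders, and the counts are kept in memory here rather than written to a file.

```python
class ClickCounter:
    """Toy in-memory version of a redirect ("bounce") click counter."""

    def __init__(self):
        self.index = {}   # url -> link index number, assigned on first click
        self.clicks = {}  # url -> number of clicks recorded so far

    def bounce(self, url):
        """Record one click on url and return an HTTP-style redirect."""
        if url not in self.index:
            self.index[url] = len(self.index) + 1
            self.clicks[url] = 0
        self.clicks[url] += 1
        return 302, {"Location": url}

    def report(self):
        """Return rows of (link index number, link, number of clicks)."""
        return sorted((self.index[u], u, self.clicks[u]) for u in self.index)


counter = ClickCounter()
counter.bounce("http://example.com/story-a")
counter.bounce("http://example.com/story-b")
status, headers = counter.bounce("http://example.com/story-a")

for row in counter.report():
    print(*row)
```

Because the sketch keys on the exact address string, two variants of an address (for example with and without a query string) would get separate index numbers and separate counts.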


Jan 24 2008

Independent Website Update: Some Numbers

This is an update to my earlier notes about the Independent Newspaper’s web redesign having broken all the hyperlinks to it on my site - including the ones from yesterday morning that sent them 50 visitors. Having checked, I have sent the Independent something like 400-500 visitors in the last 2 weeks or so - just on links that I am monitoring. And we are only a C+ list blog on a good day. Heaven knows how much traffic they get from the likes of Political Betting, and the other sites with 8-10 times our unique visitors. Mike Smithson’s post there yesterday based around a link to the Indy had 364 comments on it, and he gets around 50,000 page views a day - compared to our 4,000-7,000.

This is seriously not clever. Not preserving permalinks (i.e., web addresses of articles) like this is the most effective way to - how do I put this strongly enough to get the point over - fuck your own website, short of deleting it altogether. Among other things:

It costs traffic.
It costs Technorati rank (even though some people don’t care about it - I am not amongst them).
And it costs Google backlinks (so searches go to blank places until Google deletes the link).

Broken permalinks happen to nearly every blogger who moves from the Blogspot service to their own domain, and for bloggers it normally takes between 3 and 6 months of promotion to get back to where they were before in profile. Most political bloggers know next-to-nothing about the technical side of the internet - so they at least have an excuse; national newspapers do not.

Come on Indy. Get your arse in gear and sort this out by Monday. There won’t be any more links to you in my roundups until I know what you are doing.

The Times gets it right. I would make a comment on the new Not the Spectator Coffee House Blog, but the whole site is down. Chicken Yoghurt has also covered this.

Tags: independent, newspaper publishing company, open house blog, permalink
