April 2008

April 28, 2008

Speaking, Listening, Partying, Learning and Otherwise Engaged at eMetrics SF

The eMetrics Marketing Optimization Summit happens May 4-7 and I'm so looking forward to it!  Although the conference is practically walking distance from home, I still think of it as a Big Trip.  Here's what I'm excited about:

I get to speak.  James Gardner and I will do a modified reprise of the presentation we gave at eMetrics DC last fall, and we'll be sharing the hour with the inimitable Dylan Lewis and Mark Brooks.  The four of us have worked out a nicely-coordinated set of talks focused on career development and staffing in web analytics.  Read about our track.

I also get to listen.  With gusto I will attend as many sessions as I can.  After the conference I'll do a write-up on my blog.  If you just can't wait that long I may also post some juicy tidbits on Twitter (tsk, Twitter) so feel free to follow me.

And I am throwing a big party. Have you not heard?  Web Analytics Wednesday happens on May 6th and you're invited.  Muchas gracias to Eric T. Peterson, David Rogers, our fine sponsors (Coremetrics, SiteSpect, ForeSee, Tealeaf and eMetrics), our super volunteers and every single participant.  Sign up now if you haven't already. We already have more than 160 web analytics professionals signed up to attend!

If you're attending eMetrics and you'd like to meet, please introduce yourself - I still look like the picture on my blog, except I have a new short haircut.  If you are in the vicinity but you can't spring for an eMetrics ticket, don't despair!  There are 3 associated free activities during the week: Web Analytics Wednesday, the WAA Raucous Caucus, and the eMetrics Expo-Only pass.

Here's to eMetrics.  It's going to be a busy week but a lot of fun.

April 23, 2008

Web Analytics Data Reconciliation How-To Guide

I suspect that most experienced web analysts have done at least one data reconciliation project during the course of their tenure.  For something so common, however, it rarely gets discussed. 

Sure, it's not sexy like Angelina Jolie, but even Plain Jane likes a little attention now and then.

Data reconciliation is an important foundational activity because, when done well, it will inspire people to have confidence in the data that you share with them. Data quality will never be perfect, but it should be good enough for everyone to feel that they can make sound business decisions based on what's available.

Enough pep talk.  If you're on the brink of your first data reconciliation project, here's what to do:

1) Identify your two data sources

The need for data reconciliation arises when you have two separate systems that provide similar sets of data.  One of these sources - let's call it "Primary Source" - will necessarily be your standard web analytics application.  The other one - let's call it "Secondary Source" - can be one of several things, namely:

  • An upstream system, like campaigns (banner, search, email)
  • A downstream system, like commerce or downloads or form submissions
  • A parallel system, such as when you migrate from one web analytics tool to another.  In this case I'd advise you to break your project into smaller chunks, one for each individual report you care to reconcile.

2) Learn how your primary source gets collected

Read the documentation and talk to your internal tech team.  Be clear on the scope of the data you're collecting, i.e. exactly which pages are tagged, or exactly which log files are processed.  If you're using page tags, know whether the tag is placed at the top or the bottom of the page (this will affect when the tag fires, which in turn affects the level of data loss to some extent).  Make note of any special filters, transformations or business logic used here.

3) Learn how your secondary source gets collected

If your secondary source is a parallel web analytics system, repeat the process you followed in step 2, above. 

If it's an upstream system, you're stuck with whatever documentation and lore you can glean about how that system works. 

If it's a downstream system you'll need to identify the group within your business that owns that system, then grill them on how they do data collection and how they transform the data into the metric you're trying to reconcile.  There's a lot of variability here, especially if your downstream system is homegrown, so be sure to do a thorough investigation.  As in step 2, make note of any special filters, transformations or business logic used here.

4) Compare data sets

Pick a sensible date range and granularity level, then pull corresponding data from both sources.  A good default would be daily totals for a month.  If you're dealing with really high volume you may want to isolate a subset of your data based on some attribute that you can reliably pull from both sources, like a single URL (for downloads) or a single product (for commerce).

Now put your data sets side by side in Excel and calculate the delta.  Compare the trends over time and see if you can explain the differences.  Ask yourself, are you comfortable with the differences you see?  If not, consider fine-tuning the way you pull data from your primary and/or secondary source in order to account for those differences.

Reality check: you're never going to get a perfect match.  This is a good exercise, but know when to say when. Do not obsess!
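If you'd rather not do the side-by-side comparison in Excel, the same delta calculation can be sketched in a few lines of Python.  This is only an illustration: the function name daily_deltas, the choice to express the gap as a percentage of the secondary source, and the daily totals themselves are all made up for the example, not pulled from any real system.

```python
from datetime import date


def daily_deltas(primary, secondary):
    """Line up two {day: total} dicts and compute the gap for each shared day.

    Returns a list of (day, primary_total, secondary_total, delta, pct_delta)
    tuples, where delta = primary - secondary and pct_delta is the delta as a
    percentage of the secondary total (None when the secondary total is 0).
    """
    rows = []
    # Only compare days that appear in BOTH sources.
    for day in sorted(primary.keys() & secondary.keys()):
        p, s = primary[day], secondary[day]
        delta = p - s
        pct = (delta * 100.0 / s) if s else None
        rows.append((day, p, s, delta, pct))
    return rows


# Hypothetical daily totals (e.g. orders counted by each system).
primary = {
    date(2008, 4, 1): 950,
    date(2008, 4, 2): 1010,
    date(2008, 4, 3): 880,
}
secondary = {
    date(2008, 4, 1): 1000,
    date(2008, 4, 2): 1000,
    date(2008, 4, 3): 1000,
}

for day, p, s, delta, pct in daily_deltas(primary, secondary):
    pct_text = "n/a" if pct is None else f"{pct:+.1f}%"
    print(f"{day}  primary={p}  secondary={s}  delta={delta:+d}  ({pct_text})")
```

Days that exist in only one source are silently skipped here; in a real project those gaps are worth listing separately, since a missing day is itself a difference to explain.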

5) Document and share your findings

This is the most important step.  Write a report about what you've done and what you've found.  Now go talk to people - give a verbal presentation of findings to your web analytics colleagues and your concerned data stakeholders. 

At this point you should be able to speak with confidence about the differences in the two data sources, and your goal should be to pass this confidence on to the people around you.  Save your report for future reference, as newcomers are likely to ask the questions you've already answered.

6) Plan to revisit if necessary

If reconciliation is part of tool migration, you are now done.  Good work.

If your secondary source is an upstream or downstream system, plan a periodic audit to make sure your findings are still valid.  If your systems are stable you can get away with doing this maybe once a year, but if you have any appreciable changes - like a major site redesign or a shopping cart overhaul - you may wish to do another quick round of reconciliation at that time.
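One way to make the periodic audit quick is to compare the gap you see today against the gap you wrote down in your report.  Here's a rough Python sketch of that idea.  The function name has_drifted and the 2-point tolerance are my own assumptions for illustration; pick whatever threshold you and your stakeholders are comfortable with.

```python
def has_drifted(baseline_pct, current_pct, tolerance_pts=2.0):
    """Return True when the current gap differs from the documented gap
    by more than tolerance_pts percentage points.

    baseline_pct: the average percentage gap recorded in the original report.
    current_pct:  the average percentage gap measured during the audit.
    tolerance_pts: hypothetical threshold, in percentage points.
    """
    return abs(current_pct - baseline_pct) > tolerance_pts


# Hypothetical numbers: the report documented a -5.0% gap.
if has_drifted(baseline_pct=-5.0, current_pct=-9.5):
    print("Gap has moved; time for another round of reconciliation.")
else:
    print("Gap is in line with the documented findings.")
```

A check like this won't tell you why the gap moved, only that it did, which is your cue to go back through steps 2 to 4.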

April 07, 2008

Podcast Interview on the Aquent Talent Blog

Matthew Grant of the Aquent Talent Blog recently conducted a podcast interview with me; I invite you to listen to it here.  If you haven't got time for the whole show, skip straight to the highlight at 11:45, where I somehow manage to divert our conversation from web analytics to goats.

Aside from that one sidetrack we did actually talk a lot about web analytics, and, in particular, web analytics careers.  You'll find the podcast interesting if you're a newbie web analyst, and I'd also recommend it if you're thinking about taking contract work in web analytics (as I'd done prior to joining Semphonic).

I met Matthew earlier this year through his Aquent colleague, James Gardner.  James and I gave a joint presentation at eMetrics DC last fall on career management for web analysts, and we'll do a reprise at eMetrics SF next month.