June 04, 2008

Tracking Downloads with Page Tags, Just the Basics

I offer a useful tidbit for those of you who 1) rely on a page-tagging solution for web analytics, and 2) care about tracking downloads.  I've encountered this situation several times as a web analyst, so I thought I'd write up a basic summary of the issue. 

You will find this relevant if you are either a newbie web analyst or an experienced web analyst charged with educating your data consumers.  Here's what you need to know:

Page-tagging

Many popular web analytics tools available today – like Omniture and Google Analytics – rely solely on JavaScript tags for data collection.  Using this method, site owners place a tag on every page they wish to track; when a visitor accesses a tagged page, information is sent back to the web analytics tool.  That, in a nutshell, is page-tagging.

Before you read any further, confirm the method of data collection used on your site.  If you've got something other than page-tagging - such as log files, or network collection, or a hybrid solution - then you can stop reading now.  The issue I'm about to describe only matters if you use page-tagging alone for data collection.

Tracking downloads

Here's the problem with using page-tagging to track downloads: it's not possible to embed a JavaScript tag inside a downloadable file.  Instead, the tag tracks the very moment the visitor clicks a link to get a file.  This counts downloads initiated, not downloads completed. 
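To make the click-time measurement concrete, here is a minimal Python sketch of the decision a tag has to make when a visitor clicks a link. It is not how Omniture or Google Analytics actually implement download tracking; the extension list and the function names (`is_download_link`, `count_downloads_initiated`) are assumptions made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse
import posixpath

# Hypothetical list of file types treated as downloads; adjust for your site.
DOWNLOAD_EXTENSIONS = {".pdf", ".zip", ".exe", ".dmg", ".doc", ".xls", ".mp3"}


def is_download_link(href: str) -> bool:
    """Return True if the link's path ends in a download-type extension."""
    path = urlparse(href).path
    _, ext = posixpath.splitext(path)
    return ext.lower() in DOWNLOAD_EXTENSIONS


def count_downloads_initiated(clicked_hrefs):
    """Count clicks on download links, keyed by file name.

    Each click is counted at the moment it happens; nothing here knows
    whether the file transfer later finished.
    """
    counts = Counter()
    for href in clicked_hrefs:
        if is_download_link(href):
            counts[posixpath.basename(urlparse(href).path)] += 1
    return counts


if __name__ == "__main__":
    # Made-up click data for illustration only.
    clicks = [
        "http://example.com/files/whitepaper.pdf",
        "http://example.com/files/whitepaper.pdf?src=home",
        "http://example.com/about.html",
        "http://example.com/files/Installer.EXE",
    ]
    print(count_downloads_initiated(clicks))
```

Note that the count goes up on the click itself, which is exactly why the resulting number describes downloads initiated.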

Measuring downloads initiated is not wrong, but it is slightly upstream from the spot we'd ideally like to track: successful completion of the download.  Counting this slightly-upstream-but-still-valid action will necessarily overstate the number of real downloads, though, since it's entirely possible to back out of the process before the file has finished downloading.

What it all means

Do not expect accounting-level precision from tag-based download stats.  Take the metric at face value, and make sure everyone who uses the data understands how to interpret it.  If you find yourself needing to compare tag-based downloads with an overlapping source of business data, I invite you to read my data reconciliation how-to guide post.

When I encountered this issue with downloads most recently (last month, in fact), my client opted to put a statement like the following as a footnote in a widely-distributed report that includes download stats:

"Values in this report approximate the number of successfully completed [Product X] downloads. Since no status is returned to the server when a download completes, it's not possible to get an exact figure. Therefore, we count downloads initiated."

So, armed with education and footnotes, you and your fellow analysts should feel confident using the "download initiated" value as a valid marker of site success.

Extra credit

Here's one of my favorite pieces by early photographer Eadweard Muybridge.  It shows progressive snapshots of a single activity, so it seems appropriate to include here.

[Image: Muybridge]

Image credit, Digital Journalist.

March 27, 2008

Where to Put Integrated Data: 7 Helpful Questions

Web analytics data integration goes both ways.  When you marry clickstream data with other business data, you can put the combined result either inside or outside your web analytics application.  The trick is, if you can put it either place, how do you decide which place is best? 

Here are 7 questions to consider as you make your decision:

  1. Is this a once-off or will you need an ongoing feed?  Say you're working on a deep-dive analysis project, or you're preparing a data set to use for data mining.  You're probably pulling activity from a discrete period of time.  If so, integrate outside your web analytics application, where there's less overhead for a one-time task.  If, on the other hand, you're going to want this integrated data to be available at a moment's notice for all eternity, you're best off integrating wherever you can most easily automate your feed, which brings me to my next point:

  2. How much effort will it take to automate, in vs. out?  Call me lazy or call me practical, but sometimes the right answer is the easiest one (that's Occam's Razor, right?).  The major commercial web analytics vendors have built-in integration tools, like Coremetrics Connect and Omniture Genesis.  If the data you need to integrate falls within the realm of what your web analytics application can handle, use the wizard and take a feed in.  If, on the other hand, you want to integrate custom data that's not wizard-able, take a feed out instead - but make sure you've got IT resources to help you automate the load into the destination system.

  3. Which analysis tools do your data consumers prefer to use?  Maybe you've got a favorite data visualization application (like Tableau), or predictive modeling software, or another business intelligence tool that people at your company like to use.  Yes?  Then integrate your data in a place where it will be easy to get at using that tool, most likely outside your web analytics application.  If you plan to use Excel you have more of a choice, because most web analytics vendors have Excel plug-ins.  You could integrate within your web analytics application and then feed it to Excel, or, if your data set is small enough, you could integrate by VLOOKUP()-ing right there inside Excel.

  4. Are your data consumers already active users of your web analytics application?  If so, you'd be doing them a favor by putting the integrated data where they're most likely to use it.  On the other hand, if they spend all day working with some other business data system, put it there instead.  It could be the factor that determines whether the integrated data ever gets adopted in practice by the people who are expected to use it. 

  5. Will you need reporting components that web analytics applications handle especially well, like browser overlay and pathing?  This will depend on whether your web analytics application actually lets you display integrated data in browser overlay and pathing reports.  If so, and if you can imagine actually using these reporting components, try to integrate inside your web analytics application.  Although web analytics applications are not as robust, generally speaking, as other data analysis tools, they manage to do a good job of presenting clickstream-specific data. 

  6. Are you hoping to integrate data that can actually be gathered at collection time?  Maybe the extra business data you want to integrate is something you'll be able to assign to a custom variable in your web analytics application at collection time.  If so, you'll be able to integrate without any after-the-fact joining.  If your integration data doesn't surface until further downstream, though, you can't use this approach.

  7. Do you need to store your integrated data behind the corporate firewall?  This isn't so much a technical issue as a legal one.  If the data you want to integrate involves personally-identifiable information and you're using a hosted web analytics solution, go re-read your site's privacy policy.  Chances are you will need to store the integrated data behind your own corporate firewall.  If you host your own web analytics application on-site, you may still be able to integrate inside it; otherwise you'll need to pull a feed out.
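For small data sets, the VLOOKUP()-style join mentioned in question 3 can be sketched outside Excel as well. The Python below is only an illustration of the idea, not any vendor's feed format: the function name `vlookup_join`, the field names (`campaign_id`, `visits`, `revenue`), and the sample rows are all hypothetical.

```python
def vlookup_join(clickstream_rows, business_rows, key, fields):
    """Attach selected business fields to each clickstream row by a shared key.

    Like an exact-match VLOOKUP, the first matching business row wins.
    Clickstream rows with no match get None for each requested field.
    """
    lookup = {}
    for row in business_rows:
        lookup.setdefault(row[key], row)  # keep the first match only

    joined = []
    for row in clickstream_rows:
        match = lookup.get(row[key])
        merged = dict(row)  # copy so the input rows are left untouched
        for field in fields:
            merged[field] = match[field] if match else None
        joined.append(merged)
    return joined


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    clickstream = [
        {"campaign_id": "A1", "visits": 500},
        {"campaign_id": "B2", "visits": 320},
        {"campaign_id": "C3", "visits": 75},
    ]
    business = [
        {"campaign_id": "A1", "revenue": 1200.0},
        {"campaign_id": "B2", "revenue": 640.0},
    ]
    for row in vlookup_join(clickstream, business, "campaign_id", ["revenue"]):
        print(row)
```

Rows that come back with None are the equivalent of #N/A in Excel, and they are worth checking before the combined data goes into a report.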

So, depending on your situation it's perfectly reasonable to join data in both directions - inside and outside your web analytics application.  Strive to find a solution that's practical, easy, legal, and most likely to make your data analysts happy.

Urchin Sticker Followup

My vintage Urchin sticker giveaway has ended.  Last week I mailed envelopes - literally all over the world - to 10 lucky recipients.  It was a fun way to meet my blog readers, so thanks for participating.

Here's a sticker I kept for myself, stuck to my home laptop:

[Image: Urchin sticker on laptop]

March 16, 2008

Vintage Urchin Sticker Giveaway

After reading Lars Johansson's entertaining post on Web Analytics Memorabilia I immediately went for a dig in my sticker pile. [Does everyone have a sticker pile, or is that just me?]

Story: Back in 2001 I worked for a first-generation web analytics vendor called WhiteCross Systems.  One day, while researching the competition, I found a vendor called Urchin.  They were giving away stickers on their web site, and I like stickers, so I sent away for some.  Two weeks later the stickers arrived.  I promptly tossed them in my sticker pile and forgot about them for 7 years.

Fast-forward to 2008.  Urchin has become Google Analytics.  My stickers have acquired retro-cool appeal.  I've been hoarding them long enough, so if you'd like a sticker, write to me and I'll mail you one.  I have 10 to give away.

[Update, March 19th: My stickers are all gone!  The giveaway has ended.]

[Image: Urchin sticker]

March 03, 2008

A Case for the Omniture Implementation Toolkit

You may have heard that my company, Semphonic, recently published the Omniture Implementation Toolkit.  By way of a story - my own "How I learned Omniture" story -  here's why I believe it's a worthwhile investment:

Back in 2005 I was doing contract work in web analytics.  I'd just landed a gig at a big company in Silicon Valley.  They'd had a plain-vanilla Omniture tag on their site for a couple of years, but they weren't getting much return on investment.  They hired me, I think, just so they'd have someone around who gave a sh*t about the data.

I had been using web analytics tools for a respectable 5 years by that point, mostly logfile-based custom data warehousing solutions.  This company brought me on board knowing full well that I had never used Omniture before, that in fact I had never used a page-tagging solution before.  I still feel fortunate that they hired me with faith in my ability to pick up new skills on the job.

I spent the first few days at my new workplace learning people's names, learning the org chart, learning the unavoidable dialect of cryptic business acronyms.

Sometime during that settling-in period one of my coworkers appeared at my cube, silent, holding a giant ream of printer paper.  He dropped the unbound stack - bam - from about 2 feet up, onto my desk.  The look in his eye could only mean, "Good luck with this.  You're gonna need it."  He turned and left.  I read the stack's cover sheet: "Omniture."

What did I do next?  Well, I read that whole pile of documentation.  I bookmarked the knowledge base.  I memorized the support hotline number.  I figured out how to modify the tag and test my changes.  I made mistakes, I fixed them, and I learned.  Eventually I found ways to connect my colleagues with meaningful data they could actually use.

Granted, I did all of this without the Toolkit.  But if it had existed then, it would have been a valuable supplement to Omniture's official literature.  And, since it's based on Semphonic's collective experience actually doing successful Omniture implementations, I know it would have saved me the trouble of learning all the details the hard way on my own.  I could have gotten through my trial-and-error phase faster, and I could have made informed implementation decisions with more conviction from the start.

This is basically what Gary Angel said when he made the baseball analogy in his recent blog post.  My story is just one concrete example of how, when, and by whom the Toolkit ought to be used.  You can learn more about the Toolkit here.

I will be at the Omniture Summit in Salt Lake City on March 4-7.  If you see me there, I invite you to tell me your own "How I learned Omniture" story.

December 31, 2007

Flickr Has Stats! One New Way to Measure User-Generated Content

Photo-sharing site Flickr has launched a great new feature: detailed web activity statistics for photos.  Check their FAQ for activation instructions. 

As individual participants in user-generated content sites we want to measure the attention we get.  You know it's true.  How are people interacting with our content?  And how do they find us in the first place?  Flickr Stats has the answers.

Viewing, commenting, favoriting - such activities can be used to gauge the relative popularity of photos.  It has been possible to get this information all along, but now it's in a central location and available as time series data (in a very Google Analytics-looking interface).

[Image: Flickr views]

In addition to content interaction data, Flickr Stats exposes, for the first time, referring URLs and keywords that bring people to photos.  This is where I think the real benefit will be seen.

Keyword reports give content owners new insight into the relationship between tags and image searchability, in turn encouraging photo SEO.  If someone wants their images to be found they will spend more time creating meaningful tags, and better tagging means better metadata for everyone.

So thanks, Flickr, this is great!  Just one more reason for me to justify that $25/year pro account.  I hope other user-generated content sites follow suit and expose measurement, too.  There's plenty of room for meaningful metrics wherever individuals are sharing content.