Data Integration

March 27, 2008

Where to Put Integrated Data: 7 Helpful Questions

Web analytics data integration goes both ways.  When you marry clickstream data with other business data, you can put the combined result either inside or outside your web analytics application.  The trick is, if you can put it either place, how do you decide which place is best? 

Here are 7 questions to consider as you make your decision:

  1. Is this a once-off or will you need an ongoing feed?  Say you're working on a deep-dive analysis project, or you're preparing a data set to use for data mining.  You're probably pulling activity from a discrete period of time.  If so, integrate outside your web analytics application, where there's less overhead for a one-time task.  If, on the other hand, you're going to want this integrated data to be available at a moment's notice for all eternity, you're best off integrating wherever you can most easily automate your feed, which brings me to my next point:

  2. How much effort will it take to automate, in vs. out?  Call me lazy or call me practical, sometimes the right answer is the easiest one (that's Occam's Razor, right?).  The major commercial web analytics vendors have built-in integration tools, like Coremetrics Connect and Omniture Genesis.  If the data you need to integrate falls within the realm of what your web analytics application can handle, use the wizard and take a feed in.  If, on the other hand, you want to integrate custom data that's not wizard-able, take a feed out instead - but make sure you've got IT resources to help you automate the load into the destination system.

  3. Which analysis tools do your data consumers prefer to use?  Maybe you've got a favorite data visualization application (like Tableau), or predictive modeling software, or another business intelligence tool that people at your company like to use.  Yes?  Then integrate your data in a place where it will be easy to get at using that tool, most likely outside your web analytics application.  If you plan to use Excel you have more of a choice, because most web analytics vendors have Excel plug-ins.  You could integrate within your web analytics application and then feed it to Excel, or, if your data set is small enough, you could integrate by VLOOKUP()-ing right there inside Excel.

  4. Are your data consumers already active users of your web analytics application?  If so, you'd be doing them a favor by putting the integrated data where they're most likely to use it.  On the other hand, if they spend all day working with some other business data system, put it there instead.  It could be the factor that determines whether the integrated data ever gets adopted in practice by the people who are expected to use it. 

  5. Will you need reporting components that web analytics applications handle especially well, like browser overlay and pathing?  This will depend on whether your web analytics application actually lets you display integrated data in browser overlay and pathing reports.  If so, and if you can imagine actually using these reporting components, try to integrate inside your web analytics application.  Although web analytics applications are not as robust, generally speaking, as other data analysis tools, they manage to do a good job of presenting clickstream-specific data. 

  6. Are you hoping to integrate data that can actually be gathered at collection time?  Maybe the extra business data you want to integrate is something you'll be able to assign to a custom variable in your web analytics application at collection time.  If so, you'll be able to integrate without any after-the-fact joining.  If your integration data doesn't surface until further downstream, though, you can't use this approach.

  7. Do you need to store your integrated data behind the corporate firewall?  This isn't so much a technical issue as a legal one.  If the data you want to integrate involves personally-identifiable information and you're using a hosted web analytics solution, go re-read your site's privacy policy.  Chances are you will need to store the integrated data behind your own corporate firewall.  If you host your own web analytics application on-site, you may still be able to integrate inside it; otherwise you'll need to pull a feed out.
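To make the "integrate outside" option from question 3 concrete, here is a minimal sketch of a VLOOKUP-style join done in Python instead of Excel.  The function name, the column names (product_id, product_views, margin), and the sample rows are all made up for illustration; a real feed out of your web analytics application would look different.

```python
def vlookup_join(clickstream_rows, business_rows, key):
    """Attach business columns to each clickstream row, matching on `key`.

    Behaves like a left join: every clickstream row is kept, and business
    columns are None when no matching business row exists.
    """
    # Index the business data by the join key (assumes the key is unique there).
    lookup = {row[key]: row for row in business_rows}
    # Collect every business column other than the key itself.
    extra_columns = sorted(
        {col for row in business_rows for col in row if col != key}
    )
    joined = []
    for row in clickstream_rows:
        match = lookup.get(row[key], {})
        merged = dict(row)
        for col in extra_columns:
            merged[col] = match.get(col)
        joined.append(merged)
    return joined


# Hypothetical sample data: page views exported from the analytics tool,
# and margins from some other business system.
clickstream = [
    {"product_id": "A1", "product_views": 120},
    {"product_id": "B2", "product_views": 45},
    {"product_id": "C3", "product_views": 7},
]
business = [
    {"product_id": "A1", "margin": 0.35},
    {"product_id": "B2", "margin": 0.10},
]

integrated = vlookup_join(clickstream, business, "product_id")
```

Rows with no match (C3 here) keep their clickstream values and get None for the business columns, which is roughly what an unmatched VLOOKUP tells you with #N/A.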

So, depending on your situation it's perfectly reasonable to join data in both directions - inside and outside your web analytics application.  Strive to find a solution that's practical, easy, legal, and most likely to make your data analysts happy.

December 10, 2007

More Data Integration in the Cards for Google Analytics?

I want to see everything all in one place.  That's the mantra of data integration.  That's also been the theme of 2 interesting blog posts in the past week, both with a focus on Google Analytics.

First up, Justin Cutroni of EpikOne speculates on the future evolution of Google Analytics.  A good chunk of his discussion covers the integration of advertising cost data from various sources in Google's sphere of ownership.  AdWords is the shining example; it's already there in Google Analytics and I believe it's a big selling point for the product.  [um, can I say "selling point" if the product is free?]

SEM cost data is just the beginning.  Justin thinks we'll also see print ads and audio ads and banner ads and whatever else Google decides to sell also show up in Google Analytics before too long.  His vision involves the automatic creation of offline advertising vanity URLs so they'd be tracked - snap - just like that in Google Analytics.  It's already possible to do this manually but automating it would definitely take some of the mystery and confusion out of the process.  Plus the cost data would be integrated, and that's the whole point.
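For the curious, the manual version can be sketched in a few lines of Python.  This assumes the usual Google Analytics campaign parameters (utm_source, utm_medium, utm_campaign); the vanity path, the landing URL, and the campaign values are invented for illustration, and the redirect itself would still have to be configured on your web server.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_landing_url(landing_url, source, medium, campaign):
    """Append campaign tracking parameters to a landing page URL."""
    parts = urlsplit(landing_url)
    # Keep any query parameters the landing URL already has.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


# Hypothetical mapping: the short vanity path printed in the offline ad,
# and the tagged URL it should redirect to.
vanity_redirects = {
    "/radio": tag_landing_url(
        "https://www.example.com/offers", "radio-spot", "audio", "spring-sale"
    ),
}
```

Each offline ad gets its own vanity path, so visits arriving through the redirect show up under that campaign.  Cost data is not part of this sketch; that is the piece that would have to come from the integration Justin describes.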

My main takeaway from Justin's post was that if Google owns it, Google can and should and will integrate it into Google Analytics.

So is Google the center of the universe?  Maybe.  Does Google own all the data we could ever hope to integrate with our web activity?  Absolutely not.  In the end, I believe Google will be able to provide seamless, foolproof tracking for whatever slice of the universe Google owns, but there's certainly the potential for further data integration beyond what Google officially offers up.

Case in point, Michael Whitaker of Monitus, who just this week wrote about his own Analytics Fox extension for Google Analytics.  This product uses keywords as the integration point to pull extra search data into Google Analytics; the integration happens in the browser, not at Google.  It is meant to be used just by people who operate Yahoo! Stores - obviously not Google property.

What do you need to integrate?  Could you hack it with Google Analytics and a display-level mashup?  If so, could this be a lightweight alternative to Omniture Genesis?

In conclusion I believe the future of Google Analytics data integration is twofold: officially, Google will give what Google deigns to give, and unofficially, third parties like Monitus will continue to develop useful extensions of Google's offering.

November 02, 2007

Rants and Raves about WebTrends Visitor Intelligence

Three weeks ago I attended the WebTrends Engage conference; not long thereafter I wrote this chatty post about the great party, my penthouse suite, and then a little about one of the new WebTrends products, Score.  Here it is, 3 weeks later, and the commercial web analytics vendor world has undergone a major upset - WebTrends has given 4 top execs the boot in the wake of Omniture's acquisition of Visual Sciences.  It will take a while for the dust to settle.

Current drama aside, I'd like to share my thoughts on WebTrends Visitor Intelligence (VI).  At Engage I sat through a couple of hours of live demos as well as a small-group private session.  Overheard while I was there: 

"Most marketers wouldn't know what OLAP was if it jumped up and bit 'em in the butt."

Is that so, marketers?

VI is an OLAP tool.  More precisely, it's a browser-based drag-n-drop reporting interface for visitor-level web activity data stored in a relational database, aka WebTrends Marketing Warehouse.

If you choose to integrate external customer-level data into Marketing Warehouse you can pull attributes like gender and income and age - and whatever else you know about your customers - into VI.  Can you imagine being able to easily drag gender or income or age into any web activity report, either as a dimension or as a filter?  Can you imagine being able to easily drill all the way down to the visitor level on any report?  Isn't this flexibility something we all want?

The VI and Score live demos at Engage were packed to capacity, even at the end of a busy conference day in a city full of recreational distractions.  Based on that observation I'd say, yes, there's considerable interest in a) WebTrends' new products, and b) visitor-level web activity data in general.

Queryable visitor-level web activity data is nothing new.  As an industry we've been doing this for years, but it's been the exception rather than the rule.  If there's any hope of bringing visitor-level detail into mainstream use, I think WebTrends has made a valiant effort with VI.  If it turns out that it's simply too expensive and too cumbersome to ever gain traction, why on earth do we keep trying?  When are we ever going to get it right?  Answer me that, vendors (when you're not busy cannibalizing each other).

Although I don't want to go too deep into the nitty-gritty details of VI, I do want to mention a few likes/dislikes:   

Like

  • Available as both on-site and hosted solutions.  If you've got privacy concerns, great, get the on-site solution.  If you lack IT resources, great, get the hosted solution.
  • Relational database back end, making external data integration possible.
  • Simple Excel export from VI - no SmartReports.  Right on.  SmartReports aren't worth the paper they're printed on.

Dislike

  • WebTrends Analytics will continue to exist in parallel with VI.  There are some reports you can only get in VI and some reports you can only get in Analytics.   Sounds confusing.
  • I don't believe VI will meet complex segmentation needs, especially when it comes to path-based questions like, "Give me the visitors who did x and then later did y."
  • SDC page-tagging is a requirement for VI - so VI is not an option if you use logs as your data source.
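To show what I mean by a path-based question, here is a minimal sketch of "visitors who did x and then later did y" run against visitor-level rows.  It says nothing about how VI works internally; it assumes you have pulled (visitor_id, timestamp, action) rows out of your own data store, and the action names and sample rows are invented for illustration.

```python
def did_x_then_y(events, x, y):
    """Return the set of visitor ids who did action x and later did action y.

    `events` is an iterable of (visitor_id, timestamp, action) tuples.
    Assumes x and y are different actions.
    """
    seen_x = set()   # visitors who have done x so far
    matched = set()  # visitors who did y after having done x
    # Walk each visitor's events in time order.
    for visitor_id, _timestamp, action in sorted(events, key=lambda e: (e[0], e[1])):
        if action == x:
            seen_x.add(visitor_id)
        elif action == y and visitor_id in seen_x:
            matched.add(visitor_id)
    return matched


# Hypothetical visitor-level rows.
events = [
    ("v1", 1, "view_product"),
    ("v1", 2, "add_to_cart"),
    ("v2", 1, "add_to_cart"),
    ("v2", 2, "view_product"),
    ("v3", 1, "view_product"),
]

segment = did_x_then_y(events, "view_product", "add_to_cart")
```

Only v1 lands in the segment: v2 did both actions but in the wrong order, and v3 never did the second one.  The order dependence is what makes this harder than dragging a dimension or filter onto a report.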

VI is a great start; I would love to see it get adopted and successfully used.  My fellow bloggers Jacques Warren and Anil Batra also attended Engage and have posted their reviews of VI.   Aaron Gray and OX2 have both mentioned involvement in VI proof-of-concept projects; I look forward to hearing more about their experiences.

October 01, 2007

Where We're at with Data Integration

I recently had the opportunity to ask 20 or so fellow web analytics practitioners what they're doing with respect to data integration.  Although each answer was as unique as the person who spoke, there was a common theme:  we want to do more in the future than we are doing today.

The few who are now farthest along in data integration efforts are all doing it in-house, with real data warehouses and technical teams who manage the underlying hardware and software and load processes.  Others are struggling to get systems like this up and running, all the while wondering if there is a better way.  The vast majority of us are sitting on the sidelines knowing data integration is something we want to do eventually but not knowing quite where to start.

Web analytics vendors have begun to address this common desire: see, for instance, WebTrends Visitor Intelligence and Omniture Genesis.  As hosted services, such products could make data integration possible for those of us who are not willing or able to maintain our own technical infrastructure.  If it's not the final answer, at least it's a step in the right direction. 

Assuming we get our tool issues sorted out, we'll still have people issues to contend with.  Data integration almost certainly involves the coordinated efforts of several groups within an organization.  Sometimes our counterparts don't want to share - for political reasons, or privacy reasons, or who knows what.  Getting everyone to work together may end up being the biggest challenge we'll face.

Yes, folks, that's what we're up against.  And yet, it's something we all want and I believe it's something we'll manage to do in the months and years to come.