A client provides you raw traffic files for a web site. Google Analytics has some great interpretive tools to get you started – some of the most interesting come close to one of the ‘idealized’ goals of web site information architecture analysis. Which one? The suite of tools that determines click paths through a site – following a User’s tracks through the site and determining entry and exit points.
This is passive information – literally, it is the past. Aside from usability testing, focus groups, and my mom, you have no model for determining how a given change in the information architecture, layout, etc. will alter User pathways through a site.
Ultimately it is these paths you are trying to predict, steer, and design to.
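Before any of those paths can be predicted, they have to be extracted from the raw traffic files. A minimal sketch of that step, assuming the log has already been parsed into (user, timestamp, page) hits and assuming a conventional 30-minute session timeout (both are my inventions, not anything the client's files dictate):

```python
from collections import defaultdict

# Hypothetical raw hits: (user_id, timestamp_seconds, page) tuples,
# e.g. parsed from a server log. The 30-minute timeout is an assumption.
SESSION_TIMEOUT = 30 * 60  # seconds

def sessionize(hits):
    """Group hits into per-User click paths; a gap longer than the
    timeout starts a new session."""
    by_user = defaultdict(list)
    for user, ts, page in sorted(hits, key=lambda h: (h[0], h[1])):
        by_user[user].append((ts, page))
    sessions = []
    for user, events in by_user.items():
        path = [events[0][1]]
        last_ts = events[0][0]
        for ts, page in events[1:]:
            if ts - last_ts > SESSION_TIMEOUT:
                sessions.append(path)   # close the old session
                path = [page]           # start a new one
            else:
                path.append(page)
            last_ts = ts
        sessions.append(path)
    return sessions

hits = [
    ("u1", 0, "/home"), ("u1", 60, "/watches"), ("u1", 120, "/watch/500"),
    ("u2", 10, "/watches"), ("u2", 4000, "/home"),  # long gap -> two sessions
]
for path in sessionize(hits):
    print("entry:", path[0], "exit:", path[-1], "path:", path)
```

The first and last page of each session are exactly the entry and exit points the analytics tools report.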
Where does pattern mining come in? It is, basically, the term that separates the work of modeling and predicting patterns from merely revealing them. Machine learning offers a host of tools designed first to categorize (mine), then to organize (recognize patterns), and finally to predict the continuation of those patterns.
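One of the simplest possible "pattern predictors" over mined click paths is a first-order Markov model: count the observed page-to-page transitions, then predict the continuation of a path from those counts. A sketch, with invented example paths:

```python
from collections import Counter, defaultdict

def train(paths):
    """Count page->page transitions across all mined click paths."""
    transitions = defaultdict(Counter)
    for path in paths:
        for a, b in zip(path, path[1:]):
            transitions[a][b] += 1
    return transitions

def predict_next(transitions, page, k=3):
    """Return the k most frequently observed next pages after `page`."""
    return [p for p, _ in transitions[page].most_common(k)]

# Invented click paths standing in for mined traffic data.
paths = [
    ["/home", "/watches", "/watch/500"],
    ["/home", "/watches", "/watch/120"],
    ["/home", "/sale", "/watches", "/watch/120"],
]
model = train(paths)
print(predict_next(model, "/watches"))  # -> ['/watch/120', '/watch/500']
```

A real predictor would condition on more history than one page, but the shape – mine paths, count patterns, extrapolate – is the same.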
But there is more than strictly linear extrapolation. Is a User on your site clicking on a $500 watch because they have a $500 budget and want to buy a gift, because they want a new watch, or because they were referred? If they click on another, much less expensive watch, that points to the ‘need a watch’ option – in which case you may have enough of a pattern to model what else they would click on were they to spend the next 24 hours on the site checking out the whole database. Instead, a pattern predictor could take this step for them and present those options immediately. This is passive pattern recognition, and the interaction between the pattern predictor and the User can be recursive. Is it advisable to present several options to help tune the predictor – “Are you looking for a gift? Are you looking for a new watch? What is your budget?” Ask Clippy. Maybe in some cases, but in most, passive observation is the better machine learning tool.
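The "present the options immediately" step can be done entirely passively. A minimal sketch of one way to do it – item-to-item co-occurrence over other Users' sessions, with all product names and sessions invented for illustration:

```python
from collections import Counter
from itertools import combinations

def train_cooccurrence(sessions):
    """Count how often each pair of items appears in the same session."""
    co = Counter()
    for items in sessions:
        for a, b in combinations(set(items), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return co

def recommend(co, clicked, k=3):
    """Score unseen items by co-occurrence with the User's clicks."""
    scores = Counter()
    for item in clicked:
        for (a, b), n in co.items():
            if a == item and b not in clicked:
                scores[b] += n
    return [item for item, _ in scores.most_common(k)]

# Invented sessions from other Users.
sessions = [
    ["watch-500", "watch-120", "strap-leather"],
    ["watch-120", "strap-leather"],
    ["watch-500", "gift-wrap"],
]
print(recommend(train_cooccurrence(sessions), ["watch-120"]))
```

No questionnaire, no Clippy: the User's own clicks select the pattern, and the predictor surfaces what the pattern implies they would eventually find on their own.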
Ultimately a good predictor is not just for shopping experiences – as design tools, predictors can provide objective, empirical methods of guiding new information architectures. Different pattern predictors, trained on different patterns mined from web site traffic data, can effectively act as simulated Users.
Is this straightforward? No. It may involve natural language processing and other intangibles. An ongoing issue in machine learning is that the learning algorithm may not be learning what you expect. But if these issues are carefully addressed, it may be possible to use web site traffic data to train algorithms to act as real-time beta testers. Imagine laying out an information architecture where the click-through percentages vary in real time as you design, reposition, and resize wireframe elements!
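The "simulated User" idea can be sketched very simply: replay a trained predictor against a candidate layout and report the simulated click-through rate to a goal page. Everything below is invented for illustration – the transition probabilities would, in practice, come from a model mined from real traffic, not be typed in by hand:

```python
import random

def simulate_ctr(transitions, start, goal, trials=10_000, max_clicks=5, seed=0):
    """Monte Carlo estimate of how often a simulated User starting at
    `start` reaches `goal` within `max_clicks` clicks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        page = start
        for _ in range(max_clicks):
            options = transitions.get(page)
            if not options:
                break
            pages, weights = zip(*options.items())
            page = rng.choices(pages, weights=weights)[0]
            if page == goal:
                hits += 1
                break
    return hits / trials

# Two candidate layouts: in B, the watches link is more prominent on /home,
# so the (invented) probability of clicking it is higher.
layout_a = {"/home": {"/watches": 0.2, "/sale": 0.8}, "/sale": {"/home": 1.0}}
layout_b = {"/home": {"/watches": 0.6, "/sale": 0.4}, "/sale": {"/home": 1.0}}
print(simulate_ctr(layout_a, "/home", "/watches"))
print(simulate_ctr(layout_b, "/home", "/watches"))
```

Wire the wireframe tool's element positions and sizes to those transition probabilities and you get exactly the live-updating click-through percentages described above – crude, but the loop is the point.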