BlindConfidential: Eating an Elephant: Lost in the Supermarket

Monday, December 08, 2008

“I’m all lost in the supermarket, I can no longer shop happily, I came in here for a special offer, guaranteed personality…” The Clash

I briefly mentioned the supermarket accessibility problem in the first installment of the “Eating an Elephant” series, but I did so without describing the actual complexity of the issue. I have no solution to proffer and, to the best of my knowledge, no one is researching this problem. I hope that Will Pearson sends in a comment on the matter, as he is far more expert in aspects of this topic than I am.

At a glance, the confusion of a supermarket affects sighted people as well as those of us with a vision impairment. The stores have thousands of products sorted by their similarity to other products, with the exception of displays of items on sale and products receiving extra promotion. These categorized items are distributed into aisles which contain packages of differing size, shape, color and prominence based upon how high or low they sit on a shelf. The sighted person can grow overwhelmed at the sheer vastness of visual noise, the wide array of colors and the way marketing types invent packaging to mislead the consumer as to the size and/or shape of its contents.

The sighted person, although their attention might scramble a bit, can see that aisle four contains condiments and walk to it. While in this section, they can also see that Wish Bone salad dressing is discounted and that Paul Newman’s is not and make the choice as to which they would prefer. They can also easily find the highly recognizable Tabasco trademark bottle and the Progresso hot cherry peppers they enjoy on sandwiches. This sighted customer may also see a new product with a promotional cardboard thing pointed at it and choose to give it a try. They may also see an item they hadn’t thought about before making their list and pick it up on impulse.

The person with a profound to severe vision impairment, though, has an extremely different experience. As I described in part one of this series, the customer service people at the store assign us a human to help us with the shopping. These people vary in competence from illiterate, to unable to speak a language I might understand even a little, to very helpful. Even the best shopping companion, though, will start with the question, “So, what do you want?” A well prepared blink will have printed out a shopping list; the rest of us disorganized types are left to the wilds of the shopping experience.

Often, the answer to “What do you want to get?” is, “Lots of stuff.” This means that our companion has no clue where to start and we can only begin by rattling off items we definitely know we need.

Now, let’s return to the condiment aisle example we used for our sighted friends. To oversimplify, we can imagine that each side of the aisle contains the same number of shelves and that each product has exactly the same amount of shelf space. For our simplified example, we can view each product and variation thereof as having its own cubicle. To keep the arithmetic simple, we’ll say that each side is five shelves high with 20 product cubicles per shelf. Thus, we have 100 products on each side and 200 in the aisle, a massive simplification.

Like our sighted counterpart, we know we want salad dressing, Tabasco, maybe some Progresso cherry peppers (often the store is out of stock on these) and, like our sighted friends, we may want to try a new product or pick up an item on impulse.

So we, the blind shoppers, are presented with 200 products and variations in the aisle and we may actually want to buy four or five items from this set. How can our companion, or possibly some as yet not invented bit of technology, provide us with enough but not too much information about the items in the aisle?

If our companion or technology simply tells us everything in the aisle, we will somehow need to try to hold 200 separate offerings in short term memory. This breaks the memory bank and the attention model all at once and such information overload can be discounted out of hand.

We can be told all of the categories of items in the aisle: salad dressing, hot sauce, ketchup, mayonnaise, pickles and peppers, mustard, etc. Again, we’ve a big list of items that have only a generic description and much of which we can recall from previous visits to the market. So, we’re now getting a combination of too much data plus redundant information and we still haven’t found our first item.

Like our sighted friend, we want some thousand island salad dressing. For this example we’ll say that I am especially fond of Paul Newman’s and don’t care about Wish Bone even if it is on sale. I can tell my companion to get me the dressing I want and disregard all competitors. If, however, I consider salad dressing generically, I may want the item on sale or even the Publix store brand to save a little money. In that case I need to tell my companion to list off the various brands and their prices, a boring and time consuming process that leads only to the selection of a single product.

The next item, Tabasco, is simple. I tell my companion that I want the sauce in the Catholic family sized bottle as I use a lot of it. The companion then asks, “Red or green?” I know I prefer the red but what if it was a product with which I was less familiar? Again, more time wasted determining which version of a single item I want.

The last two examples, a random item on sale and an impulse purchase, provide the most complex of the problems. There are two hundred items in this aisle, n items have sale tags (where n is a value between zero and a random figure less than 200) and all 200 minus the salad dressing and Tabasco sauce may fall into the impulse purchase category. Once more, my companion can list all sale items, possibly a large number of items in a fairly large number of categories, and to cover the impulse purchase, we need to return to the entire list minus the two items we’ve already selected.

Now, we can multiply our 200 items in the condiments aisle by the 20 aisles in the store and we have an incredibly overwhelming number of data points. Remove the constraints I placed on the number of items per aisle and we have a very complex distribution of stuff we may need or want to buy.
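
The arithmetic behind these figures can be laid out as a short sketch. The shelf, cubicle and aisle counts are the ones from the simplified example; the two seconds per item is purely my assumption, added only to give a feel for how long reading everything aloud would take.

```python
# Figures from the simplified condiment aisle example.
SHELVES_PER_SIDE = 5
CUBICLES_PER_SHELF = 20
SIDES_PER_AISLE = 2
AISLES_IN_STORE = 20

# Assumption (not a measured figure): about two seconds to speak one item.
SECONDS_PER_ITEM = 2

items_per_side = SHELVES_PER_SIDE * CUBICLES_PER_SHELF
items_per_aisle = items_per_side * SIDES_PER_AISLE
items_in_store = items_per_aisle * AISLES_IN_STORE

# Time for a companion or device to read every item in the store aloud.
hours_to_read_store = items_in_store * SECONDS_PER_ITEM / 3600

print(items_per_side, items_per_aisle, items_in_store)  # 100 200 4000
print(round(hours_to_read_store, 1))                    # 2.2
```

Even under these generous assumptions, hearing the whole store once would take more than two hours before a single item went into the cart.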

With a companion, reading everything or even every category blows past short term memory limits and any attention model I’ve ever seen described for human beings. How then can a human companion, far smarter than any technology that may be invented in the short term future, determine the balance between too much, too little and the Goldilocks amount of information the consumer with vision impairment needs and/or wants to hear?

Last week, as Susan, my lovely wife of 21 years, and I drove south from Cambridge back to our home in Florida, we pulled off at an exit in South Carolina which had fast food joints on all four corners. Susan made the executive decision that we would eat at McDonald’s; she did not tell me that we had choices nor, of course, did she tell me which choices we had. One of the others was a Wendy’s, a crappy fast food place that I prefer over McDonald’s. Susan made the assumption that fast food was generic and that I wouldn’t care or even have an opinion on which I might prefer, which, in this case, was a fallacious assumption. Susan has been married to a blink for 21 years and still hasn’t developed the knack of finding the proper middle ground level of information. How then can we expect a randomly assigned supermarket companion to have even the slightest clue what we do and do not want to hear?

The most frequently described technology possibility is based on RFID, a standard that has been due to replace UPC for a pretty long time. With something like an RFID wand, the blind consumer can hear the items that they are near. The user could turn such a device to “category” mode or “sale item” mode or any of a number of categories of information that can be held on the product’s RFID tag combined with augmentative data on the store’s Wi-Fi system. I still think this will provide too much information in a manner too complex to be truly useful but it seems to be the best idea I’ve heard so far. Getting every supermarket and product maker to retool for such a system, though, is probably not going to happen for a long time to come, if ever.
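
To make the “mode” idea a little more concrete, here is a minimal sketch of how such a wand might decide what to speak. Everything in it is hypothetical: the `TagReading` fields, the mode names and the prices are made up for illustration and do not describe any real RFID product or standard.

```python
from dataclasses import dataclass


@dataclass
class TagReading:
    """One nearby product as a hypothetical wand might report it."""
    name: str
    category: str
    price: float          # made-up prices, for illustration only
    on_sale: bool = False


def announce(readings, mode):
    """Return the lines the wand would speak in the given mode."""
    if mode == "category":
        # Speak each category once, in the order first encountered.
        return list(dict.fromkeys(r.category for r in readings))
    if mode == "sale":
        return [f"{r.name}, {r.price:.2f}" for r in readings if r.on_sale]
    if mode == "all":
        return [r.name for r in readings]
    raise ValueError(f"unknown mode: {mode}")


nearby = [
    TagReading("Wish Bone thousand island", "salad dressing", 2.49, on_sale=True),
    TagReading("Paul Newman's thousand island", "salad dressing", 3.29),
    TagReading("Tabasco red", "hot sauce", 3.99),
]

print(announce(nearby, "category"))  # ['salad dressing', 'hot sauce']
print(announce(nearby, "sale"))      # ['Wish Bone thousand island, 2.49']
```

Even in this toy form, “all” mode on a real aisle would still mean 200 spoken lines, which is exactly the overload described above.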

What can we, as people with vision impairment, do to solve the supermarket problem in the time before someone invents and distributes a device that might solve the problem? The first suggestion is to shop online and have one’s groceries delivered. These online grocery services are not available in all parts of the US. Also, returning to the problem of the current screen reader UI paradigm of reading everything as a list, slogging through a web site with zounds of items will either take a really long time or do little to solve the sale item problem and little or nothing to help with impulse or new product purchases. This, of course, has the benefit of saving one some money by putting up a wall to our potential impulses but it also leaves out the ability to discover items we may really enjoy.

Do any BC readers have any suggestions? If so, please leave comments to further the discussion.

-- End

4 Comments:

Anonymous said...: When I lived in Miami, we shopped at Publix Direct. This was an online shopping service that provided the same prices as the actual Publix store and they charged a $7.99 delivery fee. Publix said they canceled the venture because there wasn't enough demand. As far as impulse buying, it actually gave you the ability to look for what you wanted or to browse by categories. There was a link for Specials and on-sale items. I spent about 10 minutes or less browsing this list and trying to see if anything matched the items on my current shopping list. Also, if you put in, for example, apple juice, it would show you the items that matched that, including any specials, and there was a link to jump into the juice category, etc. These kinds of services have worked in the western states but I guess us dumb Floridians aren't ready for online shopping. I loved the service. One nice thing was that I could load my previously saved shopping list and add/remove items, which is where I started building my list. I'd be done shopping for a month in just under 45 minutes and delivery was scheduled usually 2 days later.
Patrick in Tampa; 6:24 PM
Unknown said...: Hi Chris. Speaking from the other side, since I'm now working in a supermarket.

1. Be a nuisance. You're a customer.
2. Working there I guess I have just as much trouble finding stuff as you, even if I am sighted.
3. Try the method my better half uses when shopping. Wander down each aisle. Home in on 'sauces'. Look for your thousand islands. Then scan either side for offers which are acceptable. It's a tree model: any old aisle, an area in which I have interest, zoom down, look for alternates, select or reject. How to automate? I don't think RFID can do any better than your sighted guide? Unless you help by selecting one level at a time? Aisle, then the next. Select top level, reads out sauces, cereals, kitchen goods etc. Once you've picked, it automatically goes down a level and lists again by smaller groups (perhaps maker? Kellogg's, etc). Then you ask for offer alternates. Working with RFID I guess you could get somewhere. The other bit of action would be, I want Kellogg's cornflakes. It says aisle 3. What? It says... I don't know. You want an alternate to satnav for that way of shopping, the person in a hurry approach? Whole new ball game.

4. Just 'cause you've been married a while, don't expect your partner to be perfect and know you want to select which fast food joint. When you're perfect, you can ask if she is! How's that for a deal?
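
The tree model from point 3 could be sketched roughly as below. It is only an illustration: the catalogue, the level order (aisle, then group, then maker) and the function names are all assumptions, and nothing here talks to real RFID hardware.

```python
# Hypothetical catalogue: aisle -> group -> maker -> list of (product, on_offer).
STORE = {
    "aisle 3": {
        "cereals": {
            "Kellogg's": [("cornflakes", False), ("bran flakes", True)],
            "store brand": [("cornflakes", True)],
        },
        "sauces": {
            "Paul Newman's": [("thousand island", False)],
            "Wish Bone": [("thousand island", True)],
        },
    },
}


def options(tree, path=()):
    """List only the choices one level below the path picked so far."""
    node = tree
    for key in path:
        node = node[key]
    if isinstance(node, dict):
        return sorted(node)
    # At the bottom level the node is a list of (product, on_offer) pairs.
    return [name for name, _ in node]


def offer_alternates(tree, aisle, group, product):
    """Makers in the same group that have the same product on offer."""
    return sorted(
        maker
        for maker, items in tree[aisle][group].items()
        if (product, True) in items
    )


print(options(STORE, ("aisle 3",)))           # ['cereals', 'sauces']
print(options(STORE, ("aisle 3", "sauces")))  # ["Paul Newman's", 'Wish Bone']
print(offer_alternates(STORE, "aisle 3", "sauces", "thousand island"))  # ['Wish Bone']
```

The point of going one level at a time is that each spoken list stays short, whatever the size of the store.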

regards Dave P.; 6:12 AM
Anonymous said...: I’m going to put the question of how to get the information to one side and focus on the question of how to filter the information. Filtering the information is something that is often overlooked when it comes to accessibility, and it’s therefore the more interesting question from my viewpoint as a researcher; well, that’s what my job description says I am anyway.

As most people will no doubt know people filter information by using the process of attention. What most people might not know is that attention comes in two flavours: endogenous attention, which is where we select objects to pay attention to, and exogenous, which is where we automatically pay attention to objects in the environment. In the case of visual attention these objects can be any size or shape as research has shown that visual attention works in a similar way to a zoom lens as we can alter the size of the area that we’re paying attention to in order to focus on groups of objects, a single whole object, or just a particular feature of an object. There are also differing views on whether we select to pay attention to the perceptual properties of objects, such as size, shape, colour, etc., or to what the object is, such as a dog, cat, or bird, and these two views are known as early and late theories of attention due to the differences in the amount of processing that we do on the information before we select to pay attention to something. All these factors are important when it comes to simulating attention.
Over the years there have been various psychological models that attempt to model the process of attention. One of the earliest, proposed by a guy called Broadbent, viewed attention as a single filter. More recent models, such as Feature Integration Theory by Treisman, can be thought of as extending attention to a system that uses multiple filters.

Probably the first filter that we all use is based on our location. Our location, the physics of light, and the human anatomy and physiology surrounding the visual system govern what visual information reaches us. So, we effectively use our location as a very coarse filter for incoming information. There are already systems in use today that simulate this to some extent. An example that quite a few people will be familiar with is the GPS solutions that are becoming fairly commonplace. These work by getting information about someone’s location and then restricting the information that they present to a user to just information about that location. So, if systems can get information about a person’s location they are able to filter information based on that location. As well as being able to use GPS to get a person’s location, systems can also use technologies such as ultrasonics, for example the ultrasonics system that Henk Muller, Cliff Randell, Mike McCarthy, and Paul Duff developed as part of Bristol’s contribution to the Equator project.

When people go shopping they’re usually looking for a particular thing or type of thing. So, it makes sense to follow the late view of attention and filter the information based on what something is rather than what it looks like. To achieve this, systems would need to incorporate some form of semantic tagging. For those of you with an interest in metadata and semantic tagging, the work that my friend Emma Tonkin has done in this area is well worth a read. If we can tag information about an object to describe what that information is, such as a list of ingredients, what type of container it is, etc., then systems can use multiple filters based on those tags. Systems would need to use multiple filters to come up with a good approximation of human attention, at least according to Treisman’s Feature Integration Theory.

The type of system that I’ve described so far would allow someone to restrict the information they got according to what information they felt they wanted to meet their goals. This would cover most of the scenarios found in shopping, as the majority of shopping is goal orientated with people going to buy specific things, but the systems that I’ve described don’t cover scenarios that aren’t goal orientated. Goal orientated scenarios usually make use of endogenous attention but other scenarios, such as buying new products or special offers can sometimes work by exogenous attention. That’s why you often find new products or special offers placed in prominent positions with their own displays that are big and brightly coloured. All of this is just intended to try to capture your attention automatically.

It’s possible to use semantic tagging to simulate exogenous attention too. In order to simulate it, systems would just have to filter the information slightly differently. The key difference that the systems would need to incorporate would be a degree of automaticity in selecting the information that they presented to a user. Systems would need to be able to identify special offers or new products, which could be accomplished by using semantic tags, and automatically announce information about those products. Systems could still use a certain amount of filtering, such as filtering based on product type, but to create a good approximation of exogenous attention they really just need to filter based on whether something is a special offer or not or whether it is a new product or not.

Impulse buying is pretty much based on making the most use out of people’s attention as they’re just walking along. Part of it is due to exogenous attention, such as eye grabbing window displays, but another part is based on making the most use of where we naturally place our attention as we walk along. It’s pretty well known that store keepers pay careful attention to the items that they place at head height. This is because that’s where we tend to put our attention as we’re walking along. So, as we’re walking along we’re going to take notice of the items that are at or near to head height. People probably only filter the information based on what the item is to begin with but change what information they filter about an object once they’ve decided that they’re interested in it. So, if systems have information about the height of an item then this could be used to approximate the information that people use to make initial decisions about impulse buying.

So, any system that aims to aid blinks when they’re shopping would need to take all this filtering into account. Getting the information in the first place is an obvious problem, and the one that people seem to put all their energy into when it comes to accessibility, but filtering the information so that the user gets just the information that they want is equally important to the success of a system.
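
As a rough illustration of how these filters might stack up, here is a minimal sketch. It is not taken from any existing system: the `Item` fields, the tag vocabulary and the head height band of 140 to 180 cm are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    aisle: int
    shelf_height_cm: int
    tags: set = field(default_factory=set)  # semantic tags, e.g. {"hot sauce"}


def by_location(items, aisle):
    """Coarse filter: only what is in the shopper's current aisle."""
    return [i for i in items if i.aisle == aisle]


def endogenous(items, wanted_tags):
    """Goal-directed filter: items carrying every tag the shopper asked for."""
    return [i for i in items if wanted_tags <= i.tags]


def exogenous(items, attention_tags=frozenset({"special offer", "new product"})):
    """Automatic announcements: items tagged as offers or new products."""
    return [i for i in items if i.tags & attention_tags]


def head_height(items, low_cm=140, high_cm=180):
    """Impulse candidates: items shelved near head height (band is assumed)."""
    return [i for i in items if low_cm <= i.shelf_height_cm <= high_cm]


items = [
    Item("Tabasco red", 4, 150, {"hot sauce"}),
    Item("Wish Bone thousand island", 4, 90, {"salad dressing", "special offer"}),
    Item("Kellogg's cornflakes", 3, 160, {"cereal"}),
]

here = by_location(items, 4)
print([i.name for i in endogenous(here, {"hot sauce"})])  # ['Tabasco red']
print([i.name for i in exogenous(here)])                  # ['Wish Bone thousand island']
print([i.name for i in head_height(here)])                # ['Tabasco red']
```

Each function is one filter, and applying the location filter first keeps the later ones working on a small set, which is loosely the multiple filter idea described above.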

Will; 10:23 AM
Anonymous said...: As a lifelong blink, I think the RFID idea plus filter by category is actually quite cool.
For online shopping malls, though, I think spatial association is something stores could use. A non-shopping example is the online role playing game "The Kingdom Of Loathing", where the player map is basically a large table. Using my screen reader and either speech or a Braille display, I can use the map in a rather visual way, moving from the distant woods on the east, to the Plains, and within the plains to the Goblins on the west side of that plain, then out and across to Seaside Town on the west. Since I'm pretty new to the game, I don't know all the names of all the places there, even though I've read the wiki and everything. But I can get where I want to go and can see how the surroundings are laid out.
Another feature of this map is that the places you can actually get to show up as links, while all the rest shows up as either empty space, completed, destroyed or otherwise modified areas.
So what if an online store did this, so you go in, you know the freezer section is at the northwest, you know your beers are two aisles in from the east, etc. Then when you click on each square or table cell, a mini map shows up so you can categorically move around within it and make your selections.
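A store laid out as a table could be modelled very simply, something like the sketch below. The floor plan, section names and compass moves are invented for illustration; a real site would render this as an HTML table rather than a Python list.

```python
# Hypothetical floor plan; row 0 is the north wall, column 0 the west wall.
STORE_MAP = [
    ["freezer", "dairy", "bakery"],
    ["cereal", "sauces", "beer"],
    ["produce", "tills", "entrance"],
]

MOVES = {"north": (-1, 0), "south": (1, 0), "west": (0, -1), "east": (0, 1)}


def move(position, direction):
    """Return the new (row, col), or the old one if the move leaves the map."""
    row, col = position
    d_row, d_col = MOVES[direction]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < len(STORE_MAP) and 0 <= new_col < len(STORE_MAP[0]):
        return (new_row, new_col)
    return position


def describe(position):
    """Name of the section in the given cell."""
    row, col = position
    return STORE_MAP[row][col]


position = (2, 2)  # start at the entrance, in the southeast corner
for step in ["north", "west", "north", "west"]:
    position = move(position, step)
    print(step, "->", describe(position))
# north -> beer
# west -> sauces
# north -> dairy
# west -> freezer
```

Each cell could then open its own smaller table in the same way, which is the mini map idea.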
A real-life example of this: I was a blind vendor for a while and kept all my store items in an Excel spreadsheet, designed to look like the store.
I even had the coolers, which opened towards the front for use and the back for stocking, put in separate worksheets that mirrored one another.
This was when I ran a store too small to hire help, so obviously I needed things to be accurate. The long-time vendors I got to know insisted on memorizing everything; there's no way I could conceive of remembering each category / brand / size's location both in front and back, allowing me to keep all things stocked. But with a spatially driven system, like Excel on a mobile device in that case, or tables in Kingdom of Loathing, I think the user would get the best of both: being able to browse, but directing your browser into the area you wanted. Recent developments with AJAX / ARIA / DHTML / XML could really make this possible in a really object-oriented fashion.
Using the Kingdom of Loathing as a model - perhaps simplistic - one could see the visual environment isn't compromised at all for sighted users. More sighted people than blinks play KoL, and if I remember right, the map hasn't changed its structure. Only alt tags have been added for accessibility.; 3:23 PM