Multimodal Navigation: More Details

I'd like to continue the discussion of the system I am proposing. I would like a flexible multi-sensor and multi-interface architecture, but with as few sensors and interfaces as possible. I am thinking of having three modes of operation, where at any time at least two modes are active. These modes are outdoors, indoors and backup. The outdoors mode would almost certainly be based on (D)GPS, and the indoors mode could use WLAN or RFID, for example. It shouldn't be too difficult to determine whether the user is indoors or outdoors, so the system should do this automatically; the fewer user-issued commands the better. The backup mode has two purposes: to provide some minimal functionality even where the other modes fail (for example in a building without an indoor positioning setup), and to make the other modes a little more stable by continuously checking whether the indoors or outdoors mode is providing correct information. For the latter feature to work, the information from outdoors<->backup and indoors<->backup needs to be comparable, which could be a tricky issue. I guess obstacle avoidance could be said to be part of the backup mode as well, since it's something that should run at all times. I am still uncertain as to what hardware the backup mode would use: is machine vision still insufficient? Could laser rangefinders provide enough information? Or might a machine learning approach with a lot of different non-navigational sensors (as described earlier) work? The backup unit should obviously not rely on a map (or anything external), so how exactly can it help the other modes? These questions still need an answer.
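To make the mode idea a bit more concrete, here is a minimal Python sketch of automatic mode selection plus a backup cross-check. Everything in it is an assumption for illustration: the `Fix` type, the function names, and the 2 m tolerance are hypothetical, and comparing short-term displacement is just one possible way of making a map-free backup mode comparable with the other two.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple


@dataclass
class Fix:
    """A position estimate in a local metric frame (metres). Hypothetical type."""
    x: float
    y: float
    valid: bool = True


def select_primary_mode(gps: Fix, indoor: Fix) -> Optional[str]:
    """Choose the primary mode without user input.

    The backup mode is assumed to run at all times, so it is not returned
    here. None means only the backup mode is available.
    """
    if gps.valid:
        return "outdoors"
    if indoor.valid:
        return "indoors"
    return None


def displacement(prev: Fix, curr: Fix) -> Tuple[float, float]:
    """Movement between two fixes as an (dx, dy) vector in metres."""
    return (curr.x - prev.x, curr.y - prev.y)


def primary_is_plausible(primary_step: Tuple[float, float],
                         backup_step: Tuple[float, float],
                         tolerance_m: float = 2.0) -> bool:
    """Compare how far the primary mode and the backup mode think the user moved.

    The backup mode has no map, so only relative movement is compared.
    tolerance_m is an assumed value, not a measured one.
    """
    dx = primary_step[0] - backup_step[0]
    dy = primary_step[1] - backup_step[1]
    return hypot(dx, dy) <= tolerance_m
```

A usage example: if (D)GPS reports a 10 m jump over one step while the backup mode measured about 1 m, `primary_is_plausible((10.0, 0.0), (1.0, 0.0))` is false and the system could treat the GPS fix as suspect.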

When it comes to interfaces I am thinking of two: an audio interface and a haptic one. The primary information channel should be a combination of speech and non-speech audio delivered through stereo bone-conduction headphones (as in the SWAN system). An electronic compass should keep track of where the head is turned, and non-speech beacon sounds should mark the path the user is to take. This essentially augments reality by overlaying auditory beacons on the GPS (or other) grid. Here it is important not to place the beacons too far apart (placing beacons only at intersections is not enough). Speech synthesis can be used to indicate points of interest and other things that are not directly related to the navigation task. The verbosity of the information should adapt to the user's familiarity with the place. The major advantage of the clear difference in purpose between speech and non-speech audio is that it makes it easy to distinguish different types of information intuitively: the movement instructions are always the beacon sounds. Speech recognition can be used to provide input. Additionally, the microphone could perhaps determine whether the environment is too loud to convey the auditory information (at a concert, for instance). To still be able to navigate in such a situation, the auditory beacons could be translated into haptic feedback in a belt. If haptic feedback can be provided in all directions around the body, then it is quite natural to feel the beacons rather than hear them. There might also be other uses for the haptic interface.
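The beacon idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: positions are given in a local east/north frame in metres, the belt has eight evenly spaced motors with motor 0 at the front, and the 85 dB loudness threshold is a made-up placeholder. All function names are hypothetical. Note also that a belt would really need the heading of the torso rather than the head; the sketch ignores that difference.

```python
from math import atan2, degrees
from typing import Tuple


def bearing_to(user_xy: Tuple[float, float],
               beacon_xy: Tuple[float, float]) -> float:
    """Compass bearing from user to beacon in degrees (0 = north, 90 = east)."""
    east = beacon_xy[0] - user_xy[0]
    north = beacon_xy[1] - user_xy[1]
    return degrees(atan2(east, north)) % 360.0


def relative_bearing(bearing_deg: float, heading_deg: float) -> float:
    """Beacon direction relative to the compass heading, in [-180, 180).

    Positive values mean the beacon is to the right, negative to the left.
    """
    return (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0


def belt_motor_index(relative_deg: float, motor_count: int = 8) -> int:
    """Map a relative direction to the nearest belt motor.

    Motor 0 is assumed to sit at the front, with indices increasing clockwise.
    """
    sector = 360.0 / motor_count
    return int(((relative_deg % 360.0) + sector / 2) // sector) % motor_count


def choose_channel(ambient_db: float, threshold_db: float = 85.0) -> str:
    """Fall back to the haptic belt when the surroundings are too loud.

    threshold_db is an assumed placeholder, not a tested value.
    """
    return "haptic" if ambient_db >= threshold_db else "audio"
```

For example, with a beacon due east and the head facing north, `relative_bearing(bearing_to((0, 0), (10, 0)), 0.0)` is about 90, so the beacon sound would be rendered on the right; in a loud environment the same angle would instead drive motor `belt_motor_index(90.0)`, the one on the right-hand side of the belt.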

In the (hopefully not too distant) future where these kinds of systems are common assistive tools, a lot of additional functionality can be added beyond navigation.

Monday, April 12, 2010
