Multimodal Navigation: 2010

Tuesday, May 4, 2010

Bone-Conduction Headphones Commercially Available

Yes they are!
http://www.audioboneheadphones.com/

Sunday, April 18, 2010

Sunday with Single-Camera SLAM

It's Sunday and I just felt like writing more about single-camera SLAM. I think it's quite a remarkable achievement. Basically, using a single camera that continually looks for certain features (landmarks), it is possible to create a map of the environment in real time without knowing any odometry information (which is usually available when SLAM is applied to mobile robots), and without the ability to perceive depth (no stereo vision).

References on this can be found in the updated database, which will shortly be up on the resources page.

Thursday, April 15, 2010

An Amazing Paper on Vision-based Methods

Just found an amazing paper that deserves its own blog post. It's a huge survey of vision-based mobile robot navigation methods written by Kak and DeSouza in 2002. The goal, as stated by the authors, was to write about vision-based techniques from "the last 20 years". The article contains more than 150 references, and can be found here: http://portal.acm.org/citation.cfm?id=505501.505509.

I should upload the BibTeX database again, as it contains some more new references.

Monday, April 12, 2010

More Details

I'd like to continue the discussion of the system I am proposing. I would like a flexible multi-sensor and multi-interface architecture, but with as few sensors and interfaces as possible. I am thinking of having three modes of operation, where at any time at least two modes are active. These modes are outdoors, indoors and backup. The outdoors mode is most certainly based on (D)GPS, and the indoors mode could use WLAN or RFID, for example. It shouldn't be too difficult to determine whether the user is indoors or outdoors, so the system should do this itself. The fewer user-issued commands the better. The backup mode has two purposes: to provide some minimal functionality even where the other modes fail (for example in a building without an indoor positioning setup), and to make the other modes a little more stable by continuously checking whether the indoors or outdoors mode is providing correct information. For the latter feature to work, the information from the outdoors and backup modes, and from the indoors and backup modes, needs to be comparable, which could be a tricky issue. I guess obstacle avoidance could be said to be part of the backup mode as well, since it's something that should run at all times. I am still uncertain as to what hardware the backup mode would use. Is machine vision still insufficient? Could laser rangefinders provide enough information? Or might a machine learning approach with a lot of different non-navigational sensors (as described earlier) work? The backup unit should obviously not rely on a map (or anything external), so how exactly can it help the other modes? Those are questions still needing an answer.
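To make the "comparable information" problem a bit more concrete, here is a minimal Python sketch of one way the backup mode could cross-check a primary mode. It rests on a big assumption that the post leaves open: that the backup hardware can produce a rough estimate of how far the user has moved since the last fix. The function names and the 5 m tolerance are made up for illustration.

```python
import math

def displacement(p0, p1):
    """Vector from p0 to p1, both given as (east, north) in metres."""
    return (p1[0] - p0[0], p1[1] - p0[1])

def is_consistent(primary_prev, primary_now, backup_step, tolerance_m=5.0):
    """Compare the movement reported by the primary mode with the backup's estimate.

    primary_prev, primary_now: consecutive (east, north) fixes from the
        outdoors or indoors mode.
    backup_step: (east, north) displacement estimated by the backup mode
        over the same interval (assumed to be available).
    tolerance_m: hypothetical threshold; a real value would need tuning.
    """
    dx, dy = displacement(primary_prev, primary_now)
    error = math.hypot(dx - backup_step[0], dy - backup_step[1])
    return error <= tolerance_m

# The primary mode says we moved (3, 4) m and the backup agrees.
print(is_consistent((0.0, 0.0), (3.0, 4.0), (3.0, 4.0)))
# The primary mode suddenly jumps 50 m while the backup saw a 5 m step.
print(is_consistent((0.0, 0.0), (30.0, 40.0), (3.0, 4.0)))
```

Comparing displacements rather than absolute positions means the backup mode does not need a map or a shared coordinate origin, only a common notion of direction and distance.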

When it comes to interfaces I am thinking of two: an audio interface and a haptic one. The primary information channel should be a combination of speech and non-speech audio delivered through stereo bone-conduction headphones (like in the SWAN system). An electronic compass should keep track of where the head is turned, and non-speech beacon sounds should mark the path the user is to take. This essentially augments reality by overlaying auditory beacons on the GPS (or other) grid. Here it is important not to place the beacons too far apart (it is not enough to place beacons at intersections). Speech synthesis can be used to indicate points of interest and other things that are not directly related to the navigation task. The verbosity of information should adapt to the user's familiarity with the place. The major advantage of the clear difference in purpose between speech and non-speech audio is that it makes it easy to distinguish different types of information intuitively: the movement instructions are always the beacon sounds. Speech recognition can be used to provide input. Additionally, the microphone could perhaps determine if the environment is too loud to convey the auditory information (being at a concert, for instance). To still be able to navigate in such a situation, the auditory beacons could be translated into haptic feedback in a belt. If haptic feedback can be provided in all directions around the body, then it is quite natural to feel the beacons rather than hear them. There might also be other uses for the haptic interface.
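As a rough illustration of the beacon idea, the Python sketch below works out where a beacon lies relative to the direction the user is facing, and which motor on a haptic belt would point at it. Everything here is an assumption for the sake of the example: positions are (east, north) in metres on a flat local grid, the compass heading is in degrees clockwise from north, and the belt has eight evenly spaced motors with motor 0 at the front.

```python
import math

def bearing_to(user, beacon):
    """Compass bearing in degrees (0 = north, clockwise) from user to beacon."""
    east = beacon[0] - user[0]
    north = beacon[1] - user[1]
    return math.degrees(math.atan2(east, north)) % 360.0

def relative_bearing(heading_deg, bearing_deg):
    """Beacon direction relative to the heading, in the range [-180, 180).

    Negative values mean the beacon is to the left, positive to the right.
    """
    return (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0

def belt_motor(relative_deg, motors=8):
    """Index of the belt motor closest to the beacon direction.

    Motor 0 is assumed to sit at the front, with indices increasing clockwise.
    """
    step = 360.0 / motors
    return int(round((relative_deg % 360.0) / step)) % motors

# The user faces east (90 degrees) and the next beacon is 10 m to the north,
# so the beacon should be heard, or felt, on the left.
rel = relative_bearing(90.0, bearing_to((0.0, 0.0), (0.0, 10.0)))
print(rel, belt_motor(rel))
```

For the audio interface the heading would come from the head-mounted compass. A belt sits on the torso, so it would presumably need the body's orientation instead.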

In the (hopefully not too distant) future where these kinds of systems are common assistive tools, a lot of additional functionality can be added beyond navigation.

Wednesday, April 7, 2010

Haptic Feedback

I have always felt that haptic feedback is an intuitive and vast resource that goes largely unused. From a mobility perspective it also offers advantages. I have found some articles investigating the feasibility of haptic interfaces in navigation systems. One of them is http://portal.acm.org/citation.cfm?id=857199.858004

Wednesday, March 31, 2010

Progress Report 5

Since my last post I have been writing more about using non-positioning sensors in a navigation system. If this approach can be made to work reasonably well, it could be used as a backup in a complete system. I have also been thinking a bit ahead about how to treat data in a system with an arbitrary number of sensors. My idea is a model where a position estimate is a weighting of the different sensors' data. The weights might perhaps be learnt at runtime, though some priorities seem clear: in an outdoor environment GPS should be the most important, whereas indoors it could be a WLAN positioning system. The model I am thinking of could also take non-positioning sensors into account, which I think could act as a backup, since (if not using vision) we can get a system that is much lighter on computational resources.
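A minimal Python sketch of the weighting idea follows. It is only one possible reading of the model: each sensor reports a position on a shared local grid, and the fused position is a weighted average. The sensor names and the weight values are invented for illustration. In the model described above the weights might be learnt at runtime rather than fixed in a table.

```python
from dataclasses import dataclass

@dataclass
class Estimate:
    sensor: str  # hypothetical sensor label, e.g. "gps" or "wlan"
    x: float     # east position in metres on a shared local grid
    y: float     # north position in metres on a shared local grid

# Hand-picked example weights per environment (assumed, not measured).
WEIGHTS = {
    "outdoors": {"gps": 0.8, "wlan": 0.1, "backup": 0.1},
    "indoors": {"gps": 0.1, "wlan": 0.7, "backup": 0.2},
}

def fuse(estimates, environment):
    """Weighted average of the position estimates for the given environment."""
    weights = WEIGHTS[environment]
    total = sum(weights.get(e.sensor, 0.0) for e in estimates)
    if total == 0:
        raise ValueError("no usable estimates for this environment")
    x = sum(weights.get(e.sensor, 0.0) * e.x for e in estimates) / total
    y = sum(weights.get(e.sensor, 0.0) * e.y for e in estimates) / total
    return x, y

readings = [Estimate("gps", 0.0, 0.0), Estimate("wlan", 10.0, 0.0)]
print(fuse(readings, "outdoors"))  # pulled towards the GPS estimate
print(fuse(readings, "indoors"))   # pulled towards the WLAN estimate
```

Dividing by the sum of the weights that are actually present means a missing sensor simply drops out instead of dragging the estimate towards the origin.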

This is probably one of the more confusing posts, but I am just starting to think about the model, and more details will come as I go. I'll add a new reference on non-positioning sensors to the bottom of the BibTeX database right now. The approach used in that paper is machine learning combined with a "data cooking" module, which they claim reduced the navigation error rate of the approach to 2%.

I will be away for five days. Happy Easter!

Thursday, March 25, 2010

Progress Report 4

Progress is a bit slow at the moment. Trying to plan out the next section. In the meantime, I'm adding a chapter on navigation using lasers, and also refining what I've written so far. From tomorrow on I will have no other course to worry about, which is a good thing as I'm preparing to take on the final chapters of this thesis.

Also, when fed up with writing I've been looking at programming libraries (vision, audio, speech...) and some random things that pop into my head from time to time. I just wish computer vision algorithms were better, as vision has huge potential in a travelling aid for someone who lacks it. I want a system to read informational signs in the environment and to tell me if that bus over there is the right one. I'm also giving self-containment a priority, that is, the device should try to rely as little as possible on external information (which might not exist in some places). That doesn't mean such information is useless, however.

I've also tested an ER1 robot. I am able to move it and have it recognise (and say aloud) objects and follow them. It really likes looking at CDs.

Friday, March 19, 2010

Mobile Robot Navigation Techniques

A very useful resource on the topic: http://www.doc.ic.ac.uk/~nd/surprise_97/journal/vol4/jmd/

Tuesday, March 16, 2010

Progress Report 3

I have reached a milestone in my work where chapter 2 (theory and related work) is pretty much complete. This is the first time I am writing in a massively parallel style, and am enjoying it. I used to write continuously, but parallelising has several advantages, an important one being the motivational aspect. It felt less secure at first, but I think I am getting used to it.

Today I will do a clean-up of my BibTeX reference database and upload the whole thing to the resources page on the right.

Saturday, March 13, 2010

RFID and Location Identification

I am at the moment reading about RFID and other sensors that might not at first thought be of any use at all in a navigation device. Everything from temperature sensors to magnetometers can be used though, as demonstrated by one of the systems I've seen. This is what I'd like to think of as location fingerprinting. I have collected some references that I am writing about.

I will update the resource page sometime with the new resources. It needs major cleaning up and proper citing as well. I will get around to that sometime!

Wednesday, March 10, 2010

ER1 Demonstration

Saturday, March 6, 2010

Progress Report 2

First off, let's celebrate passing the 4500 word mark, whatever that means. I don't like word or page counting anyway, possibly as little as I do computer spell checkers.

This week I have continued to write about the building blocks (GPS, AGPS, DGPS, WLAN, ...) and have written about two of the most complex systems I have found (Drishti and SWAN). They are excellent prototypes and contain much of the functionality I seek. Interestingly, I haven't encountered any navigation system trying the paradigm of machine learning. As I see it now, there are two major directions to go forward:
1. Make multimodality really work. There are already excellent commercial GPS systems for the blind; it is time to extend that. Find a way to put all kinds of sensors in an efficient device that lasts an acceptable time on batteries. Find clever ways of using all the information and presenting it. Provide some minimal functionality at all times.
2. Let the machine think. One must be careful here, but if done right this could lead to a much easier to use system. One of the important properties I think a system should possess is minimal (and quick) interaction. This is true for mobile devices in general, where it is important to be able to perform tasks quickly.
A system capable of learning would also make adaptation much easier. Users' habits would be picked up and the presentation and behaviour would be adjusted accordingly. Locations that are especially difficult would be noticed and presentation verbosity would be adjusted... Consistency of behaviour is important though, and so if the machine decides to behave differently it has to do so in a way the user expects, or else it could lead to much confusion.

Next week I will write about a couple of other systems, and also polish up this section and check that I actually evaluate the systems based on my own recently established criteria! Also, the pervasive computing viewpoint is something I didn't initially consider, but I definitely should, as it is a viable future research direction. This paradigm is already applied in some other aids for the disabled, including devices for those with dementia.

Also realised something simple and obvious while out walking yesterday with my phone's GPS system: it is far from enough to just give directions when approaching an intersection. Curvy roads can be difficult at times in the winter! In the winter, the world is a new one every day, as I usually say.

Saturday, February 27, 2010

Progress Report 1

Last week I did a lot of writing. This week has been more of a "plan ahead and think back" week. It is time to plan the development of the new system. The most important thing for now is to limit this to a task that can be completed in the limited timeframe of the thesis.

On the writing side of things I have continued my review of existing systems, but also the writing about the fundamental technologies that they rely upon (GPS, WLAN, RFID, computer vision).

One thing to remember is that a location can simply be defined as any kind of "fingerprint" associated with it. This is easy to forget when there are such obvious and intuitive techniques as GPS. I have seen this done by looking at colour histograms of a scene, or by using a multitude of basic sensors to obtain a "fingerprint" of a location. Very interesting.
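To illustrate the fingerprint idea, here is a small Python sketch that identifies a place by finding the stored fingerprint closest to the current sensor reading. It is a plain nearest-neighbour lookup, not the method of any of the systems I have read about. The place names, the choice of three sensor values and the numbers themselves are all invented.

```python
import math

# Hypothetical stored fingerprints: (temperature in C, noise level in dB,
# normalised light level). Real systems would use more and different sensors.
KNOWN_PLACES = {
    "library entrance": (21.5, 48.0, 0.30),
    "bus stop": (4.0, 52.0, 0.90),
}

def distance(a, b):
    """Euclidean distance between two fingerprints of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(reading, known=KNOWN_PLACES):
    """Name of the stored place whose fingerprint is closest to the reading."""
    return min(known, key=lambda name: distance(reading, known[name]))

print(identify((20.0, 47.0, 0.35)))
```

In practice the sensor values would need to be scaled to comparable ranges first, otherwise the sensor with the largest numbers dominates the distance.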

Friday, February 26, 2010

Navigation Using Stereo Vision with an ER1 Robot

Video description
This video is a short demonstration of Automated Navigation using Grassfire Algorithm on input from Stereo Vision. The ER-1 Robot gets the input from the cameras and the program deduces a bird's eye view of the scenario. It then uses the navigation algorithm to plan a route and executes it.
Final Year Project at University of Plymouth - BEng Robotics and Automated Systems.
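
For readers unfamiliar with the grassfire algorithm, here is a minimal Python sketch of the general technique on a small occupancy grid. It is not the code from the video, whose implementation I have not seen. The grid layout (0 for free, 1 for obstacle), the four-neighbour movement and the function names are assumptions made for the example.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def grassfire(grid, goal):
    """Spread a wavefront from the goal; each free cell gets its step count.

    grid: list of rows, 0 = free cell, 1 = obstacle.
    Returns a grid of distances, with None for obstacles and unreachable cells.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and dist[nr][nc] is None):
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

def plan(grid, start, goal):
    """Follow decreasing distances from start to goal; None if unreachable."""
    dist = grassfire(grid, goal)
    if dist[start[0]][start[1]] is None:
        return None
    rows, cols = len(grid), len(grid[0])
    path = [start]
    r, c = start
    while (r, c) != goal:
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and dist[nr][nc] == dist[r][c] - 1):
                r, c = nr, nc
                break
        path.append((r, c))
    return path

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(plan(grid, (0, 0), (2, 0)))
```

In the video the grid would come from the bird's eye view deduced from the stereo cameras. Here it is simply typed in by hand.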

Completed Work

Up to this point, the following has been done:
- Introductory writing: background, problems, technical challenges, goal, aim...
- Sample scenarios leading to some criteria--and some properties--a good system should possess. Those criteria will guide the analysis of current systems and will provide a basis for the system to be proposed.
- Phone-based GPS solutions have been evaluated.
- I have gathered resources on other systems as well as underlying technologies. The next stage is to evaluate and explain those.
- Also started to look at practical matters (hardware, programming libraries, etc.)

Thursday, February 25, 2010

Resources

On the right hand side of this page you can find a 'Resources' page. I will put anything I come across that is relevant to the project there.

Introduction

The project is about multimodal navigation systems primarily targeting visually impaired people. Others who might benefit are those with memory loss or those getting into situations where they might be temporarily blinded, such as fire fighters. Autonomous robot navigation can also benefit.
There are some commercially available GPS systems designed for the visually impaired, as well as quite a few research prototypes using anything from GPS and WLAN to cameras and RFID tags. Those modes have been investigated separately, but relatively little has been done to fuse the technologies together to provide a richer and more accurate system that works in environments other than simply outdoors (GPS has this limitation). If we could create a very good multimodal system, the capabilities could extend far beyond simple navigation. Scene description would be enormously helpful for someone who can't use their eyes for that purpose. Other location-based services are possible and would be easy to integrate into such a system.
There are, however, many theoretical and practical issues that need to be overcome. When information is received from multiple sources, the system needs to be able to judge how to use it and how to present it in an efficient and minimally distracting way. This can depend on both the user's preferences and his/her current environment. On the practical side, all those nice computations require energy and computing power (especially if computer vision is involved), and the system needs to be easy to carry around and use.
I am currently reviewing available systems, both commercial products and research prototypes. I will post my progress on this blog.

Wednesday, February 24, 2010

Welcome

Welcome to MultiNav, my online journal about multimodal navigation aids. At the moment I am doing my master's thesis project in this area. More info to come!
