Charles' University Blog: ELROND


Monday, February 27, 2012

Progress is progress…

So since my little excursion to Greenwich, life has been dribbling along very nicely indeed. I’ve got some plans for updating some websites which I currently run, namely moving the www.stmarybalham.org.uk site over to a PHP server running WordPress… I’m doing it to get more experience running my own server and writing PHP, but also because I wrote the code for this website before I came to university, and since then I’ve learnt some better techniques for dynamic website building than I used back then. It’ll be a challenge, but something I’m looking forward to getting my teeth into once this whole university thing is over.

My final project is coming along: it now shows a lot more information, and I’m in the process of building a test rig that can assess the accuracy and performance of the system so far, both for benchmarking and to get some pretty graphs to go into my write-up :)

And I got my results last week for my exams in January, I did really well! I got a first in all my modules and I’m averaging 82% for my degree :) Only 30% of my degree left! The end is in sight, finally!

And for those of you who are interested, the next video about my stupid housemates who are living off £1 for food for a month is up:

I didn’t think my voice could go that high for “Top tip of the week”… turns out I was wrong.

Happiness. 8 out of 10
Tiredness. 1 out of 5
Workload. 5 out of 10 (but it’s all interesting stuff…)
Last Meal. Fish and chips
Song of the day. Red Hot Chili Peppers - Scar Tissue
Thought for the day. Don’t know whether giving my housemates a beta version of my app was a good idea… :p
What I’m Doing Now. About to do some more work… after lunch….

Wednesday, February 15, 2012

The final semester begins…

My last semester of lectures started last week, and because of my cunning plan to front-load my module choices for final year I only have two modules and my project work to do. I regretted that decision in January, when I had to sit 4 exams, finish coursework and write a report for my project, but it all seems worth it now.

I’ve got a couple of marks back for coursework already, and they are both firsts, so with any luck I should still be on track. I’ll get my exam results before the end of the month and that will be a really good indicator of how I’m doing since I’ve now completed over 70% of my degree…

My new timetable is only 7 hours a week! 3 hours on Tuesday, 3 hours on Thursday and an hour on Friday. This gives me loads of time to work on my project, which is going really well (thanks for asking :P)

My project, ELROND, can now do lots of the things I explained in this diagram a while ago, and there’s still plenty of time for refinements and improvements which should give me lots to talk about in my final report. This makes me happy :)

At the moment my housemates Will and Mike are attempting to eat for a month on only £1 of food a day. I’m helping make a weekly video diary for MADTV:

That’s my voice in the intro :P

What else has been going on? Well I’m starting to think more seriously about life after uni, what to do, where to live, whether to go travelling for a while before starting work and so on… So far I’m planning a couple of weeks in California with Charlotte before starting work for the rest of my life. I’d also like to go on holiday with my housemates, but that’s just an idea at the moment.

It’s all exciting times, and Charlotte and I are off to London this weekend to see Mr Scruff play at Koko. I went last year with Julie and it was AMAZING! I’m really looking forward to round 2!

Happiness. 8 out of 10
Tiredness. 1 out of 5
Workload. 3 out of 10
Last Meal. Kung Pao Chicken!
Song of the day. Ed Sheeran’s Plus Album, it’s all good :)
Thought for the day. I was there when this happened!
What I’m Doing Now. Thinking about buying a Linode…

Sunday, November 27, 2011

ELROND #2: The Pipeline…

The second post about my final year Electronic Engineering project, which aims to identify landmarks in seconds using photographs taken on a mobile phone.

Imagine you are walking around London and you come across a building that you want to know more about. You whip out your mobile phone and open up the ELROND app. You take a picture of the building, and ELROND returns relevant information in just a few seconds. The name and purpose of the facility, whether it is open to the public, the opening times, the phone number of reception, a history, appropriate web links… wouldn’t that be useful? Well that’s just one potential application of my project.

It could equally be used to display information about portraits in a gallery just by holding your smartphone in front of a painting, with relevant information overlaid on the screen in real time…

Or just as an alternative to GPS to locate you when GPS is not available… such as indoors…

My project, titled Elrond, aims to provide the backbone or infrastructure to enable such apps to be written much more quickly. But how does it work? How does that picture of a building turn into information?

The diagram below gives an overview of the process.

Once the Android application has extracted the features from the image of the building using a feature detection algorithm (more on that in a later post) the extracted information is packaged into an XML format and transmitted over a data network (3G or Wi-Fi) to a Linux web server.
The web server then parses the XML file into a format it can understand.
Each extracted feature is then compared to the known features by Elrond, gathering a shortlist of which buildings this is most likely to be a picture of. Because the number of known features is likely to be in the order of millions, checking them one by one would be far too slow, so a neat data structure called a KD-tree is used to search the set efficiently.
The search returns a shortlist of images that most probably match the query image. The items at the top of the shortlist are the most likely, so the top 100 results are looked into in more detail. Elrond will look at all the features in both images, find ones that match, and store their locations. Then a homography is calculated to see if there is a way to map the features from the query image to the stored image. A homography is a 3x3 matrix that describes how points in one image map onto the corresponding points in the other. If the homography can map lots of points between the images, then that image is given a high match score.
After all the homographies have been calculated, the best matching image can be determined; or no match is found, in which case Elrond cannot return any information. Assuming a reasonable match was found, Elrond now knows what the building is! Relevant information about that building is maintained on a database, and so it can be fetched.
The database information is packaged into another XML file and sent to the mobile device.
The mobile device interprets the XML and graphically displays the information on the screen. Voilà!
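
For the curious, the shortlist search in the steps above can be sketched in a few lines of Python. This is only a toy illustration, not Elrond's actual code: the two-number "descriptors", the building labels and the helper names (build_kdtree, nearest, shortlist) are all made up for the example. Real descriptors have far more dimensions, and real systems usually rely on approximate searches rather than the exact search shown here.

```python
from collections import Counter


def build_kdtree(points, depth=0):
    """Build a KD-tree from (descriptor, label) pairs as nested dicts."""
    if not points:
        return None
    axis = depth % len(points[0][0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2  # the median point splits the set in two
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored (descriptor, label) pair closest to query."""
    if node is None:
        return best
    if best is None or sq_dist(query, node["point"][0]) < sq_dist(query, best[0]):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][0][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Only search the far side if it could still hold something closer.
    if diff ** 2 < sq_dist(query, best[0]):
        best = nearest(far, query, best)
    return best


def shortlist(tree, query_features, top_n=3):
    """Each query feature votes for the building that owns its nearest known feature."""
    votes = Counter(nearest(tree, f)[1] for f in query_features)
    return votes.most_common(top_n)


# Hypothetical database of known features (made-up values and labels).
known = [
    ((0.10, 0.20), "Tower Bridge"),
    ((0.15, 0.25), "Tower Bridge"),
    ((0.80, 0.90), "St Paul's"),
    ((0.85, 0.80), "St Paul's"),
    ((0.50, 0.10), "The Shard"),
]
tree = build_kdtree(known)
query = [(0.12, 0.22), (0.14, 0.27), (0.82, 0.88)]
print(shortlist(tree, query))  # [('Tower Bridge', 2), ("St Paul's", 1)]
```

The point of the tree is the pruning step in nearest: whole branches get skipped whenever they cannot contain anything closer than the best match found so far, which is what makes searching a very large set of features practical.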

I know that some of you will be reading this feeling a bit gormless, but for those who are interested in knowing more I want to write more posts about how specific parts of the application work and perform. So if you have any thoughts or suggestions, let me know.

The next post will be about less techy things…

Thursday, November 10, 2011

ELROND #1: Feature Detection in Computer Vision

The first post about my final year Electronic Engineering project. I introduce you to the topic of feature detection and what my project is about.
Feature detection is an important problem in computer vision. Computer vision is the study of extracting information from images of the real world and using it as an input to computer programs. A simple example of computer vision would be how Gollum was created for the Lord of the Rings films…

The creation of Gollum used motion capture from real world camera shots to digitally create the character.

Feature detection is the process of using computer algorithms to detect “perceptually interesting” locations in an image. Early feature detection worked by detecting edges within an image, but there were several problems with this approach.
Edge Detection

^ from Wikipedia
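
To make the edge idea a bit more concrete, here is a small Python sketch of one classic edge detector, the Sobel operator. It is only an illustration of the general technique, not something taken from the project: the tiny made-up "image" and the function name sobel_magnitude are assumptions for the example.

```python
def sobel_magnitude(image):
    """Return the gradient magnitude at every interior pixel of a 2D list."""
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # responds to vertical edges
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # responds to horizontal edges
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]  # border pixels stay at 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += gx_kernel[dy][dx] * pixel
                    gy += gy_kernel[dy][dx] * pixel
            result[y][x] = (gx * gx + gy * gy) ** 0.5
    return result


# Dark on the left, bright on the right: one vertical edge down the middle.
image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2])  # [0.0, 0.0, 36.0, 36.0, 0.0, 0.0]
```

The large values sit exactly where the brightness jumps, and the flat regions on either side score zero.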
One of the biggest uses of feature detection is a process known as Image Registration, where a computer can find similar images and determine that they are actually of the same building/object/face/landscape/tree/etc. And this is the basis for my project.
everyday object recognition

^ originally from http://ils.intel-research.net/ (now removed)
The premise of my project, titled Scalable Landmark Recognition on Mobile Devices, is to allow anyone with a mobile phone to take a photograph of a landmark and be told what it is, where it is and some interesting facts about it. For example, you could take a picture of the Eiffel Tower and the phone app would recognise the landmark and return information about when it was built, how tall it is, how much ticket prices are today, opening times etc.

^ from http://mastersofmedia.hum.uva.nl/2011/10/10/app-review-google-goggles/
This will be a downloadable mobile app for Android phones, and it will use feature detection to find similar images in its database and return information about the closest match. The app will require internet access to communicate with a server that runs the query and holds the database.
The project was dubbed “ELROND” by Anne, who I lived with in second year, because it’s close to SLROMD (which is what the actual abbreviation would be) and it also has Lord of the Rings references, which I’m all for!
So far I’ve got most of the feature detection working, and it should be quick enough to identify a match from thousands of images in under a second… but only time will tell. I’ll explain more about how my current solution works in a later post, but for now here’s a screenshot of it working on my computer…

On the top-right you can see the query image, this would be the image taken on the phone.
Below it is the image matched to it from the database.
The white lines are lines between features matched between the query and database image (using a process called FLANN matching).
The green box shows where the query image would fit onto the database image if you were to stitch them together in a panorama (the mapping between the two images is called a homography).
The top-middle window (with the red circles) shows the features found in the query image. The features are found using the SURF algorithm.
The left of the image shows the code output, showing the progress of the database search and how the query image matched to the other images in the database.
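
As a rough illustration of how a box like the green one can be drawn, the Python sketch below pushes the corners of a query image through a homography, then counts how many matched feature pairs agree with it. Everything here is assumed for the example: the matrix values, the 640x480 image size, the match coordinates and the function names apply_homography and count_inliers are all hypothetical. Estimating the homography from the matches in the first place is a separate step that is not shown.

```python
def apply_homography(h, point):
    """Map an (x, y) point through a 3x3 homography matrix h."""
    x, y = point
    xw = h[0][0] * x + h[0][1] * y + h[0][2]
    yw = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (xw / w, yw / w)  # divide by w to get back to pixel coordinates


def count_inliers(h, matches, tolerance=3.0):
    """Count matched pairs that h maps to within `tolerance` pixels."""
    inliers = 0
    for query_pt, db_pt in matches:
        px, py = apply_homography(h, query_pt)
        if ((px - db_pt[0]) ** 2 + (py - db_pt[1]) ** 2) ** 0.5 <= tolerance:
            inliers += 1
    return inliers


# Hypothetical homography: scale by 0.5, then shift by (100, 50).
H = [[0.5, 0.0, 100.0],
     [0.0, 0.5, 50.0],
     [0.0, 0.0, 1.0]]

# Corners of an assumed 640x480 query image give the outline of the box.
corners = [(0, 0), (640, 0), (640, 480), (0, 480)]
box = [apply_homography(H, c) for c in corners]
print(box)  # [(100.0, 50.0), (420.0, 50.0), (420.0, 290.0), (100.0, 290.0)]

# Made-up (query point, database point) pairs; the middle one is a bad match.
matches = [((10, 10), (105, 55)), ((200, 100), (350, 20)), ((400, 300), (301, 199))]
print(count_inliers(H, matches))  # 2
```

The more matched pairs that land within the tolerance, the more convincing the match between the two images.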