Wednesday, 24 July 2013

Goodbye...

As of 17th July 2013 I have graduated from the University of Bristol as a Master of Engineering in Computer Science. I obtained Upper Second Class Honours.

Further work on Photogorithms is on hold for now, unless I find myself with a lot of spare time.

If you wish to contact me, my details are available on alexsheppard.co.uk.

Saturday, 11 May 2013

Conclusions

By completing this project I have proposed a novel solution that uses image processing tools and techniques to aid photographers when processing and editing photographs, based on the field of computational photography.

Summary of Achievements

  • Researched image processing techniques- in order to process the images in the dataset
  • Face detection- using SURF keypoint detection and FLANN feature matcher
  • Scoring metric- to score the photographs and subsequently select the best photo from each group
  • Dataset- collected a dataset of 2700 graduation photos to test the solution
  • Eye & gaze detection- I created a new method for determining whether a subject was looking at the camera, using only the data available from a single image
  • Rule of thirds- determined whether photographs obeyed the rule of thirds
  • Saving and UI- created a user interface to allow the end user to easily use the solution and adjust the weightings of the features
  • User testing- obtained the opinions of over 200 people to reduce bias and improve general accuracy and reliability
  • Platform- used C++ and the OpenCV library to implement the aims and objectives.
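As an illustration of the rule of thirds item above, here is a minimal sketch of how such a check could look. It is not the code from the project: the function names, the use of the subject's horizontal centre and the 5% tolerance are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: distance from a coordinate to the nearest thirds line,
// expressed as a fraction of the image extent (0 means exactly on a line).
double thirdsOffset(double position, double extent) {
    assert(extent > 0.0);
    const double p = position / extent;  // normalise to the range [0, 1]
    return std::min(std::abs(p - 1.0 / 3.0), std::abs(p - 2.0 / 3.0));
}

// Hypothetical rule: the photo "obeys" the rule of thirds if the subject's
// horizontal centre lies within `tolerance` of a vertical thirds line.
// The default tolerance of 5% of the width is an assumed value.
bool obeysRuleOfThirds(double subjectCentreX, double imageWidth,
                       double tolerance = 0.05) {
    return thirdsOffset(subjectCentreX, imageWidth) <= tolerance;
}
```

In this sketch a subject centred at x = 200 in a 300 pixel wide image sits on the right thirds line and passes, while a dead central subject at x = 150 does not.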

Friday, 10 May 2013

User Feedback Results


A huge thank you to everyone who took part in my user testing. A grand total of 185 people took part online via the survey and 18 in person looking at prints.


Results in Person

Blinking. All 18 users spotted that the subject was blinking in the left photograph and so agreed with what I thought was the right photo.

Not looking. Again, every user noticed that the subjects weren't looking in the right photo and agreed with my choice of the left photo.

Large group composition. This photo created the biggest split between the users, with slightly more choosing the portrait orientation rather than the expected landscape. Many users said that they couldn't see any difference between the 2 photos, apart from the subject in the back right being slightly more visible in the portrait photo, while others commented that the portrait photo was zoomed in. In both these cases the users failed to realise the orientations were different.

Portrait composition. At first glance the users couldn't see any difference between the 2 images. After repositioning the photos in front of them, the majority realised the horizontal alignment was different and then as expected they identified the left photo as the better composition.

Landscape composition. After realising the previous composition technique, the users quickly applied the same logic to this photo and they all chose the right photograph as expected.

Feet. This pair took a while for users to decide on. Many required a hint as to what they thought of the feet before arriving at the expected response, while a couple decided the left photo was 'better arranged' or the subjects were 'leaning in more'.

Arms. This final pair also proved difficult. Some users said the slight rotations of the bodies were better in the left image, but after prompting, two thirds of the users agreed that the arms looked better in the right photo.

Rule of thirds. All the users chose the best 2 images, but sometimes couldn't decide whether they preferred the subject dead central or on the right thirds line. The most observant commented that they would prefer a photo half way between 2 and 3, which was the photo that was originally taken.

Pairs results for feedback in person
Rule of thirds composition results for feedback in person

Results Online

Blinking. All but 4 users identified the correct photo.

Not looking. All but 2 users agreed with me and identified the left photo where both subjects were looking.

Large group composition. 75% of users chose the portrait composition rather than the landscape. A lot of comments refer to the subject partially hidden in the back right, but there were also comments referring to 'landscape feels like a nicer crop'.

Portrait composition. Two thirds identified the left photo as the better composition. Many users said they couldn't see any difference, possibly because they couldn't move the images round unlike the users who were holding the prints.

Landscape composition. Again, two thirds identified the right photograph as expected and there were many comments about not being able to see a difference.

Feet. The majority of users arrived at the expected photo after identifying the stance of the right subject, but a few commented that the left photo was 'more zoomed in' or that they couldn't see any noticeable difference.

Arms. Just over two thirds chose the expected photo with the other third commenting that the left subject was facing straight on, but still many found it was 'difficult to give a reason, left just looks nicer'. 

Rule of thirds. Photos 2 and 3 accounted for 91% of the users, with the remaining 9% being split between the remaining seven images. Users that chose photo 3 gave reasons such as they 'can see whole body, also for some reason prefer it slightly to the right', while 107 users said it was 'obvious' that it should be centred. A few commented that photo 3 obeyed the rule of thirds.
Pairs results for feedback online
Rule of thirds composition results for feedback online

Thursday, 21 February 2013

Dataset and Program Flow

A Correction

In the Pre-Christmas Update, I said that there were 2 libraries for face recognition in OpenCV. This is incorrect, as I had glossed over the difference between face detection and face recognition.
  • Face detection finds faces in an image and this is mainly what I will be using.
  • Face recognition matches an input face against a known library of faces. I am not interested in this as much at this stage, but it could be useful in grouping photos by recognising who is in each photo, and also if I create a gallery whereby a user can click on a face and find other photos with that person in it. 

Dataset

The dataset I will be using predominantly throughout the project is a set of University of Bristol graduation photos I took in summer 2012. This consists of just under 2700 photographs from a week's shooting, which have been sorted by hand into roughly 3 groups:
  • good photos to keep
  • poor photos that are not needed
  • photos which are ok, but not amazing.
An interesting area of this project is how the computed 'good photos to keep' will compare to those chosen by a human.
The photos themselves can be separated approximately into different compositions such as the following:

Full length portrait
Full length landscape
3/4 length portrait
1/2 length landscape
head and shoulders landscape
The size of the dataset, just under 10GB, will also help to show how the program copes with large, real world data.
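One way the comparison between the computed 'good photos to keep' and the hand-sorted ones could be measured is sketched below. This is only an assumption about how the evaluation might be done, not a method taken from the project: the `Agreement` struct and `compareSelections` function are made-up names, and photos are identified by file name purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical result type: how well the computed selection matches the
// selection made by hand.
struct Agreement {
    double precision;  // fraction of computed keeps that the human also kept
    double recall;     // fraction of human keeps that the program also kept
};

// Compare two sets of file names: the photos the program chose to keep and
// the photos sorted by hand into the 'good photos to keep' group.
Agreement compareSelections(const std::set<std::string>& computed,
                            const std::set<std::string>& human) {
    std::size_t both = 0;
    for (const std::string& name : computed) {
        if (human.count(name) > 0) {
            ++both;
        }
    }
    assert(both <= computed.size());  // the overlap cannot exceed either set

    Agreement result{0.0, 0.0};
    if (!computed.empty()) {
        result.precision = static_cast<double>(both) / computed.size();
    }
    if (!human.empty()) {
        result.recall = static_cast<double>(both) / human.size();
    }
    return result;
}
```

A low precision would mean the program keeps photos a human would throw away, while a low recall would mean it throws away photos a human would keep.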

Program Logic Flow

To help segment the code and ensure that all the stages have been included, a basic logic flow for the program has been constructed:
  • load images
  • group images
  • for each image
    • initialise metrics
    • face detection
    • face features
    • pupil tracking
    • arms
    • feet
    • ...
    • finalise metrics
  • remove bad photos
  • select best photo
  • copy to new folder.
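The later stages of the flow above (finalise metrics, remove bad photos, select best photo) could be sketched roughly as follows. This is an outline under assumptions rather than the real implementation: the `Photo` struct, the weighted average used as the score and the `minScore` cut-off are all hypothetical, and the image loading, grouping and feature detection stages are assumed to have already filled in the fields.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record produced by the earlier stages of the flow.
struct Photo {
    std::string file;
    int group;                    // id assigned by the grouping stage
    std::vector<double> metrics;  // one value per feature, each in [0, 1]
};

// Assumed scoring rule: weighted average of the per-feature metrics.
double weightedScore(const Photo& photo, const std::vector<double>& weights) {
    assert(photo.metrics.size() == weights.size());
    double total = 0.0;
    double weightSum = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        total += weights[i] * photo.metrics[i];
        weightSum += weights[i];
    }
    return weightSum > 0.0 ? total / weightSum : 0.0;
}

// Drop photos scoring below minScore, then keep the highest scoring
// remaining photo of each group. Returns group id -> chosen file name.
std::map<int, std::string> selectBest(const std::vector<Photo>& photos,
                                      const std::vector<double>& weights,
                                      double minScore) {
    std::map<int, std::string> best;
    std::map<int, double> bestScore;
    for (const Photo& photo : photos) {
        const double score = weightedScore(photo, weights);
        if (score < minScore) {
            continue;  // "remove bad photos"
        }
        const auto it = bestScore.find(photo.group);
        if (it == bestScore.end() || score > it->second) {
            bestScore[photo.group] = score;  // "select best photo"
            best[photo.group] = photo.file;
        }
    }
    return best;
}
```

The files named in the returned map are the ones that would then be copied to the new folder. A group whose photos all fall below the cut-off simply has no entry in the map.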

Dissertation

Apart from the structure of the dissertation being created, little progress has been made in the way of writing.

Thursday, 7 February 2013

First Coding Day

Yesterday I started work on the coding for the project.

I had a quick look at the OpenCV build environments in Visual Studio that I had used last year, which used an earlier version of OpenCV (2.3.1). After confirming these still worked and all the linker settings were still intact, I updated to the latest version, 2.4.3.

I then created a new Visual Studio C++ project with a simple C++ file that read in an image using both the C and C++ interfaces, similar to the tutorial described here. This resulted in the following:

Loading and displaying an image

Before finishing, I had a quick read on using the older FaceDetection library, which, as discussed in the previous post, may or may not be compatible with OpenCV 2. It turns out that it does work, and using snippets of the facedetect.c file bundled with OpenCV with the Haar cascade classifier resulted in the following output image:

FaceDetection with Haar Like Features
This (unedited) image is the first from the testing set I am going to use throughout the project. Details of the testing set shall be explained at a later date. As you can see, the result is not entirely accurate, but enough progress has been made on this first day.

Wednesday, 19 December 2012

Pre-Christmas Update

Initial Feedback

From my first presentation to the department, I received some feedback on my specification. Key outcomes include:
  • Retaining focus on the key areas, so that I can complete these well rather than doing lots of areas not so well
  • Following on from the above, this is especially true of the 2 potential directions that this project could take. The intelligent resizing methods will therefore only be looked at if the feature finding lacks challenge and depth
  • A metric for measuring and judging the quality of the results. This can be based on rules programmed into the solution as well as possibly using crowd sourcing techniques
  • Evaluation- based on the metric, how can I explain and show whether the system is selecting the best images?

OpenCV and Face Recognition Libraries

To aid the image processing side of the project, I have decided to work with the OpenCV library. OpenCV uses the BSD license and so it's OK to use. Previous use of OpenCV in C++ and C++ in general means it should be quite easy to pick up again and the fact that it works cross platform maximises the project's potential usage. I will primarily be working on Windows, but will be testing on Linux as well.

After a little research I have found 2 libraries for face recognition in OpenCV.

FaceRecognizer Class

This class is actually included with the latest version of OpenCV 2.4, so it will make it far easier to get started with recognising faces within images. The library comes with 3 algorithms depending on how you wish to recognise the faces:
  • Eigenfaces
  • Fisherfaces
  • Local binary patterns histograms.

More info:
FaceRecognizer Tutorial

FaceDetection

The FaceDetection library is older- it uses OpenCV 1 and warns that there may be compatibility issues with later versions. A link to a 'new' version of a similar library that works with OpenCV 2 is provided that uses the 'cascade' classifier.

More info:
FaceDetection Examples

Dissertation

By the end of January I plan to have made substantial progress on my dissertation. Since I will have completed all of my background reading of papers and articles on the area of image processing and computational photography, I will be able to write the technical background section. This will also reinforce my knowledge in the area and it may reveal some potential issues I may come across later in the implementation, and so it will leave me enough time to find a solution.
Having to explain the basis of the project in written form will give me a definite objective and make sure all viewpoints are covered.

In short:
  • Introduction
  • Supporting technologies
  • Contextual background- explanation and motivation for the underlying problem
  • Technical background- information on related work so that the reader can understand the aim.

Sunday, 18 November 2012

Introduction


Photographers can spend a lot of time sorting through photos, trying to find the best photo of a particular group, so that everyone is looking, no one is blinking, all heads are visible etc.
My aim is to automate this by means of computational photography, so that the time spent on this laborious but necessary step is reduced. I aim to detect features within the image that make up good and bad aspects of a photo and classify based on these features.
I myself have spent hours going backwards and forwards through a set of photos, struggling to decide which photos I like the best. At the end of the day, these photos may go into a gallery where the people in the photos can buy them, and they will be more likely to buy if they are given the best selection available.
An extension to this is to integrate face recognition into the gallery, so that the user is only shown photos which they are in.
Furthermore I aim to look at intelligent resizing methods to adjust the composition of images to obey the rule of thirds, automatically correct rotation etc. so that they look more pleasing.

Objectives

  • Determine features that can classify good and bad parts of an image
  • Collect data set of around 500 images 
  • Work out a metric for each feature as to how well it classifies images
  • Test on new, unknown images and real world testing
  • Refine features from testing
  • Create facial recognition gallery
  • Investigate intelligent resizing.