A new Milky Way Project paper was published to the arXiv last week. The paper presents Brut, an algorithm trained to identify bubbles in infrared images of the Galaxy.

Brut uses the catalogue of bubbles identified by more than 35,000 citizen scientists from the original Milky Way Project. These bubbles are used as a training set to allow Brut to discover the characteristics of bubbles in images from the Spitzer Space Telescope. This training data gives Brut the ability to identify bubbles just as well as expert astronomers!

The paper then shows how Brut can be used to re-assess the bubbles in the Milky Way Project catalogue itself, and it finds that more than 10% of the objects in this catalogue are really non-bubble interlopers. Brut is also able to discover bubbles missed by previous searches, usually ones that were hard to see because they are near bright sources.
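To make the idea of "train on citizen-science labels, then re-assess the catalogue" a bit more concrete, here is a toy sketch in Python. It is not Brut's actual method: it swaps in a very simple nearest-centroid classifier, and the two features per object, the object names, and all the numbers are made up for illustration.

```python
from statistics import mean


def train_centroids(examples):
    """Average the feature vectors for each label.

    examples: list of (features, label) pairs, where features is a tuple of
    numbers and label is "bubble" or "non-bubble".
    """
    rows_by_label = {}
    for features, label in examples:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


def flag_interlopers(centroids, catalogue):
    """Return the names of catalogue entries the classifier does not call a bubble."""
    return [name for name, features in catalogue
            if classify(centroids, features) != "bubble"]


# Hypothetical training set: (ring contrast, central brightness) per object,
# with labels standing in for the volunteers' classifications.
training = [
    ((0.9, 0.2), "bubble"),
    ((0.8, 0.3), "bubble"),
    ((0.2, 0.8), "non-bubble"),
    ((0.1, 0.9), "non-bubble"),
]

# Hypothetical catalogue entries to re-assess.
catalogue = [("toy-object-A", (0.85, 0.25)), ("toy-object-B", (0.15, 0.7))]

centroids = train_centroids(training)
suspects = flag_interlopers(centroids, catalogue)
```

In this toy setup `suspects` holds the entries that look more like the non-bubble examples than the bubble ones, which is the same kind of question Brut asks of the real catalogue with a far more capable model.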

At first it might seem that Brut removes the need for the Milky Way Project – but the truth is exactly the opposite. This new paper demonstrates a wonderful synergy that can exist between citizen scientists, professional scientists, and machine learning. The example outlined with the Milky Way Project is that citizens can identify patterns that machines cannot detect without training, while machine learning algorithms can use citizen science projects as input training sets, creating amazing new opportunities to speed up the pace of discovery. A hybrid model of machine learning combined with crowdsourced training data from citizen scientists can not only classify large quantities of data, but also address the weaknesses of each approach if deployed alone.

We’re really happy with this paper, and extremely grateful to Chris Beaumont (the study’s lead author) for his insights into machine learning and the way it can be successfully applied to the Milky Way Project. We will be using a version of Brut for our upcoming analysis of the new Milky Way Project classifications. It may also have implications for other Zooniverse projects.

If you’d like to read the full paper, it is freely available online at the arXiv – and Brut can be found on GitHub.

[Cross-posted on the Milky Way Project blog]

Just over three years ago the Zooniverse launched the Milky Way Project (MWP), my first citizen science project. I have been leading the development and science of the MWP ever since. 50,000 volunteers have taken part from all over the world, and they’ve helped us do real science, including creating astronomy’s largest catalogue of infrared bubbles – which is pretty cool.

Today the original Milky Way Project is complete. It took about three years and users have drawn more than 1,000,000 bubbles and several million other objects, including star clusters, green knots, and galaxies. It’s been a huge success, but there’s even more data! So it is with glee that we have announced the brand new Milky Way Project! It’s got more data, more objects to find, and it’s even more gorgeous.

This second incarnation of my favourite Zooniverse project[1] has been an utterly different experience for me. Three years ago I had only recently learned how to build Ruby on Rails apps and had squirrelled myself away for hours carefully crafting the look and feel for my as-yet-unnamed citizen science project. I knew that it had to live up to the standards of Galaxy Zoo in both form and function – and that it had to produce science eventually.

Building and launching at that time was simpler in one sense (it was just me and Arfon that did most of the coding[2]) but so much harder as I was referring to the Rails manual constantly and learning Amazon Web Services on the fly. This week I have had the help of a team of experts at Zooniverse Chicago, who I normally collectively refer to as the development team. They have helped me by designing and building the website and also by integrating it seamlessly into the now buzzing Zooniverse infrastructure. The result has been an easier, smoother process with a far superior end result. I’ve essentially acted more like a consultant scientist, with a specification and requirements. I’ve still gotten my hands dirty (as you can see in the open source Milky Way Project GitHub repo) but I’ve managed to actually keep doing everything else I now do day-to-day at the Zooniverse. It’s been a fantastic experience to see personally how far we’ve come as an organisation.

The new MWP is being launched to include data from different regions of the galaxy in a new infrared wavelength combination. The new data consists of Spitzer/IRAC images from two surveys: Vela-Carina, which is essentially an extension of GLIMPSE covering Galactic longitudes 255°–295°, and GLIMPSE 3D, which extends GLIMPSE 1+2 to higher Galactic latitudes (at selected longitudes only). The images combine 3.6, 4.5, and 8.0 µm in the “classic” Spitzer/IRAC color scheme[3]. There are roughly 40,000 images to go through.

GLM_261.3032+00.8282_mosaic_I124

An EGO (or two) sitting in the dust near a young star cluster

The latest Zooniverse tech and design is being brought to bear on this big data problem. We are using our newest features to retire images with nothing in them (as determined by the volunteers of course) and to give more screen time to those parts of the galaxy where there are lots of pillars, bubbles and clusters – as well as other things. We’re marking more objects – bow shocks, pillars, EGOs – and getting rid of some older ones that either aren’t visible in the new data or weren’t as scientifically useful as we’d hoped (specifically: red fuzzies and green knots).
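For the curious, the idea of retiring empty images can be sketched in a few lines of Python. This is only an illustration of the general approach, not the actual Zooniverse code: the function name, the minimum number of views, and the "nobody marked anything" threshold are all assumptions made up for this example.

```python
def should_retire(mark_counts, min_views=5, max_marked_fraction=0.0):
    """Decide whether an image can be retired as empty.

    mark_counts: one entry per volunteer who saw the image, giving how many
    objects that volunteer drew on it.
    min_views and max_marked_fraction are hypothetical thresholds.
    """
    if len(mark_counts) < min_views:
        # Not enough volunteers have looked yet, so keep showing the image.
        return False
    views_with_marks = sum(1 for count in mark_counts if count > 0)
    return views_with_marks / len(mark_counts) <= max_marked_fraction
```

With these made-up defaults, an image seen by five volunteers who all drew nothing would be retired, while an image where even one of them marked an object would stay in circulation.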

It’s very exciting! I’d highly recommend that you go now(!) and start classifying at www.milkywayproject.org – we need your help to map and measure our galaxy.

-----

[1] It’s like choosing between your children

[2] Arfon may recall my resistance to unit tests

[3] Classic to very geeky infrared astronomers

An amazing image I found on the Milky Way Project. Despite running the project, I still find amazing things there all the time.