Archives For Research

One Week

March 2, 2015 — 1 Comment

Yesterday marked 5 years since I joined the Zooniverse team in Oxford, straight out of my PhD at Cardiff. It’s weird to say it but this week will be my final week here before I start a new role at Google, in London.

When I arrived at Zooniverse there were only two people here: Arfon Smith and Chris Lintott. Though there has always been a cloud of other researchers around the Zooniverse, they were the only full-time Zooniverse team. That changed a lot in the next 5 years!

(Most of) the Zooniverse Team, May 2014

I’ve never been one to fit in other people’s boxes, so Zooniverse suited me from the start. Unconventional, yet accessible; research, but not as we knew it. The Zooniverse has been a fantastic place to work. Indeed it still is. I’ve had the pleasure of building unique projects that have benefited astronomy and science. I’ve worked with remarkable researchers, developers, educators, and herders. It has been a lot of fun and I’ve been able to be part of its growth and evolution.

Over the years I have read many blogs and articles, usually written by someone leaving research, about how academia has a brain drain problem, or lacks a family-friendly environment, or can’t compete with industry. I have sometimes agreed, though usually quietly. Most of these pieces are dismissed by those left in academia, even if they are shared widely by them at the same time. I won’t be writing such a post. Do I think academia is perfect? No. But no job suits everyone. Do I think that academia could do more for minorities, women, and families? Yes. But all jobs probably could. Being a postdoc has afforded me great flexibility with my time, and also given me the chance to travel and engage in awesome new ideas. It hasn’t given me stability though, and since I don’t want to be a professor, I’m not sure where it takes me as a career. I’d recommend it to everyone and no one at the same time. I’ve had a great time, but now it’s time to go. I’m terrified of course, but sometimes you have to make a giant leap when the opportunity presents itself.

Recycled Electrons and The Rewatch will both continue. The Rewatch will remain mostly unchanged, but you will hear less of me on Recycled Electrons – simply the result of time constraints. .Astronomy is also being taken care of, and I’ll blog about that separately. Rest assured though that #dotastro 7 and 8 are in hand.

It will be so sad to leave the Zooniverse, but I’m incredibly excited about Google. I’ll probably go quiet here for a while as I start my new job. I’m not gone though – just throwing myself into the new role, and meeting an exciting challenge head on. See you on the other side.

Since 2008 I have been running .Astronomy, which is a meeting/hackathon/unconference that aims to be better than normal meetings and to foster new ideas and collaborations. It’s a playground for astro geeks that is more specific than a general hack day, but way more freeform than a normal astronomy meeting. At .Astronomy we have developed into an amazing community.

I know people that have gotten jobs because of .Astronomy, changed careers because of .Astronomy – or even left astronomy because of .Astronomy (in a good way!). We have evolved into an interesting group, with a culture and way of thinking that we take back to our ‘real’ jobs after each event.

In short: it works. Now I’d like to work out how to spread the idea into more academic fields. We’re looking for people in other research areas, such as economics, maths, chemistry, medicine and more.

Adler Planetarium

I have funding from the Alfred P. Sloan Foundation to bring a handful of non-astronomers to this year’s .Astronomy, in Chicago at the amazing Adler Planetarium (December 8-10). The aim is to meet up at the end, and discuss whether you think it could work in your own field, and what you’d need to make that happen. If you’re a researcher, who isn’t an astronomer, and you think this sounds great then that could be you! We have funding to pay for flights, hotels and expenses. It will be a lot of fun – and despite the astronomy focus of the event, I think most researchers, with a bit of tech experience, would get a lot out of it.

If you’re interested then fill out the short form at http://bit.ly/dotastromulti or email me on rob@dotastronomy.com for more information. We are following a formal selection process, but we’re doing it very quickly and will decide by Nov 7th, to allow enough time ahead of the event to make travel plans and such. So don’t delay – do it now!

If you don’t think you’re the right person for this, then maybe you know who could be. If so, let them know and send them to http://dotastronomy.com/about/astronomy-6-multidisciplinary-program/ for more information.

The latest issue of Astronomy & Geophysics includes an article by yours truly about the GitHub/.Astronomy Hack Day at the UK’s National Astronomy Meeting in Portsmouth earlier this year.

The projects resulting from hack days are often prototypes, or proof-of-concept ideas that are meant to grow and expand later. Often they are simply written up and shared online for someone else to take on if they wish. This ethos of sharing and openness was evident at the NAM hack day, when people would periodically stand up and shout to the room, asking for anyone with skills in a particular area, or access to specific hardware.

Take a look here: http://astrogeo.oxfordjournals.org/content/55/4/4.15.full?keytype=ref&ijkey=kkvGWSg3ABbIy5S

Martian Nyan Cat

publications

Executable papers are a cool idea in research [1]. You take a study, write it up as a paper and bundle together all your code, scripts and analysis in such a way that other people can take the ‘paper’ and run it themselves. This has three main attractive features, as I see it:

  1. It provides transparency for other researchers and allows everyone to run through your working to follow along step-by-step.
  2. It allows your peers to give you detailed feedback and ideas for improvements – or make the improvements themselves.
  3. It allows others to take your work and try it out on their own data.

The main problem is that these don’t really exist ‘in the wild’, and where they do they’re in bespoke formats even if they’re open source. IPython Notebook is a great way of doing something very much like an executable paper, for example. Another way would be to bundle up a virtual machine and share a disk image. Executable papers would allow for rapid-turnaround science to happen. For example, let’s imagine that you create a study and use some current data to form a theory or model. You do an analysis and create an executable paper. You store that paper in a library and the library periodically reruns the study when new data become available [2]. The library might be a university library server, or maybe it’s something like the arXiv, ePrints, or GitHub.

This is roughly what happens in some very competitive fields of science already – only with humans. Researchers write papers using simulated data and the instant they can access the anticipated data they import, run and publish. With observations of the Cosmic Microwave Background (CMB) it is the case that several competing researchers are waiting to work on the data – and new data come out very rarely. In fact, the day after the Planck CMB data was released last year, there was a flurry of papers submitted to the arXiv. Those who got in early had likely pre-written much of the work and simply ran their code as soon as they had downloaded and parsed the new, published data.

If executable papers could be left alone to scan the literature for new, useful data then they could also look for new results from each other. A set of executable papers could work together, without planning, to create new hypotheses and new understanding of the world. Whilst one paper crunches new environmental data, processing it into a catalogue, another could use the new catalogue to update climate change models and even automatically publish significant changes or new potential impacts for the economy.

It should be possible to make predictions in executable papers and have them automatically check for certain observational data and automatically republish updated results. So one can imagine a topical astronomy example where the BICEP2 results would be automatically checked against any released Planck data, with new publications created when statistical tests are met. Someone should do this if they haven’t already. In this way, papers can continue to further, or verify, our understanding long after publication.
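To make the idea a little more concrete, here is a minimal sketch of what such a self-checking paper might look like. Everything in it is an assumption made for illustration: the `ExecutablePaper` class, the 3-sigma threshold, the toy numbers, and the use of a simple "distance from the prediction in standard errors" test are all hypothetical, and a real version would need to fetch data from an archive rather than take a list.

```python
import hashlib
import json
from dataclasses import dataclass, field
from statistics import mean, stdev
from typing import List, Optional


def fingerprint(data: List[float]) -> str:
    """Hash a dataset so the paper can tell when new data has appeared."""
    return hashlib.sha256(json.dumps(data).encode("utf-8")).hexdigest()


@dataclass
class ExecutablePaper:
    """Hypothetical executable paper: a prediction bundled with its analysis."""
    title: str
    predicted_value: float
    threshold_sigma: float = 3.0  # assumed threshold for republishing
    last_seen: Optional[str] = None
    publications: List[str] = field(default_factory=list)

    def analyse(self, data: List[float]) -> float:
        """Distance between the data mean and the prediction, in standard errors."""
        standard_error = stdev(data) / len(data) ** 0.5
        return abs(mean(data) - self.predicted_value) / standard_error

    def check(self, data: List[float]) -> str:
        """Rerun the analysis only when the data has changed."""
        current = fingerprint(data)
        if current == self.last_seen:
            return "unchanged"
        self.last_seen = current
        sigma = self.analyse(data)
        if sigma >= self.threshold_sigma:
            # Stand-in for "automatically republish updated results".
            self.publications.append(
                f"{self.title}: prediction in tension at {sigma:.1f} sigma"
            )
            return "republished"
        return "consistent"


# Toy usage: the 'library' polls the paper with each data release it sees.
paper = ExecutablePaper(title="Toy prediction", predicted_value=0.20)
first_release = [0.19, 0.21, 0.20, 0.22, 0.18]
second_release = [0.05, 0.06, 0.04, 0.05, 0.06, 0.05]
for release in (first_release, first_release, second_release):
    print(paper.check(release))
```

The fingerprint is only there so the paper does nothing when it is polled with data it has already seen; the interesting design question is what statistical test should be allowed to trigger a new publication.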

SKA Rendering (Wikimedia Commons)

This is high-frequency science [3], akin to high-frequency trading, and it seems like an interesting approach to some upcoming data-flow issues in science. The Large Hadron Collider (LHC), Large Synoptic Survey Telescope (LSST), and Square Kilometre Array (SKA) are all huge scientific instruments set to explore new parts of the universe and gather huge volumes of data to be analysed.

Even the deployment of Zooniverse-scale citizen science cannot get around the fact that instruments like the SKA will create volumes of data that we don’t know what to do with, at a pace we’ve never seen before. I wonder if executable papers, set to scour the SKA servers for new data, could alleviate part of the issue by automatically searching for theorised trends. The papers would be sourced by the whole community, and peer-reviewed as is done today, effectively crowdsourcing the hypotheses through publications. This cloud of interconnected, virtual researchers would continuously generate analyses that could be verified by some second peer-review process, since one would expect a great deal of nonsense in such a setup.

When this came up at a meeting the other day, Kevin Page (OeRC) remarked that we might just be describing sensors. In a way he’s right – but these are software sensors, built on the platform and infrastructure of the scientific community. They’re more like advanced tools; a set of ghost researchers, left to think about an idea in perpetuity, in service of the community that created them.

I’ve no idea if I’m describing anything real here – or if it’s just an expression of a way of partially automating the process of science. The idea stuck with me and I found myself writing about it to flesh it out – thus here is a blog post – and wondering how to code something like it. Maybe you have a notion too. If so, get in touch!

———-

[1] But not a new one really. It did come up again at a recent Social Machines meeting though, hence this post.
[2] David De Roure outlined this idea quite casually in a meeting the other day. I’ve no idea if it’s his or just something he’s heard a lot and thought was quite cool.
[3] This phrasing isn’t mine, but as soon as I heard it, I loved it. The whole room got chatting about this very quickly so provenance was lost I’m afraid.

Today is the start of the UK National Astronomy Meeting (NAM) in Portsmouth. I’ll be there tomorrow, and running the NAM Hack Day on Wednesday with Arfon Smith – which is going to be awesome. Today at NAM, the nation’s astronomers will discuss the case for UK involvement in the Large Synoptic Survey Telescope project – the LSST. The LSST is a huge telescope, and a massive undertaking. It will change astronomy in a profound way.

A photograph and a rendering mix of the exterior LSST building, showing the dome open and road leading away from the site.

With every image it takes, the LSST will be able to record a very large patch of sky (~50 times the size of the full Moon). It will take more than 800 images each night and can image its* entire sky twice a week! Billions of galaxies, stars, and solar system objects will be seen for the first time and monitored over a period of 10 years. Crucially it will use its rapid-imaging power to look for moving or ‘transient’ things in the night sky. It will be an excellent tool for detecting supernovae, asteroids, exoplanets and more of the things that move from night-to-night or week-to-week. For example, the LSST could be used to detect and track potentially hazardous asteroids that might impact the Earth. It will also help us understand dark energy – the mysterious force that seems to keep our universe expanding – by mapping the precise location of billions of galaxies.

I’ve recently become LSST:UK’s Public Data Coordinator – think ‘chief hacker’ if you prefer. The LSST’s unprecedented archive of data will be a resource we can tap into to create new kinds of public outreach tools, data visualisations, and citizen science. In recent years, we at the Zooniverse have pioneered citizen science investigations of data in astronomy**. The citizen science and amateur astronomy communities around the UK, and the world, will be able to access the amazing data that comes out of the LSST both through structured, Zooniverse-style projects and in a more freeform manner. The potential for discovery will be on a scale we haven’t seen before. It’s very exciting.

The LSST is a public-private partnership and is led by the United States. The unique scientific opportunities presented by the LSST have led to the formation of a group of astronomers from more than 30 UK universities. We’ll be asking for funding from the Science and Technology Facilities Council to support UK participation in the project.

Spinnaker Tower from the Gosport Ferry

If you’re at NAM this week, then I’d love to talk about LSST, hacking on data, and Zooniverse. On Wednesday you’ll find me in the Park Building, at the University of Portsmouth, at the GitHub/.Astronomy NAM 2014 Hack Day. I’ll also be at the GitHub drink up on Tuesday night at The White Swan from 7pm – where you can enjoy some of the finest cask ales, draught beers and wines in Portsmouth – and GitHub are paying! More details at https://github.com/blog/1849-github-meetup-in-portsmouth-uk.

* i.e. the sky visible from its location – not literally the entire sky
** We’ve now had more than 1 million volunteers pass through our digital doors.

A new Milky Way Project paper was published to the arXiv last week. The paper presents Brut, an algorithm trained to identify bubbles in infrared images of the Galaxy.

Brut uses the catalogue of bubbles identified by more than 35,000 citizen scientists from the original Milky Way Project. These bubbles are used as a training set to allow Brut to discover the characteristics of bubbles in images from the Spitzer Space Telescope. This training data gives Brut the ability to identify bubbles just as well as expert astronomers!

The paper then shows how Brut can be used to re-assess the bubbles in the Milky Way Project catalog itself, and it finds that more than 10% of the objects in this catalog are really non-bubble interlopers. Furthermore, Brut is able to discover bubbles missed by previous searches too, usually ones that were hard to see because they are near bright sources.

At first it might seem that Brut removes the need for the Milky Way Project – but the truth is exactly the opposite. This new paper demonstrates a wonderful synergy that can exist between citizen scientists, professional scientists, and machine learning. The example outlined with the Milky Way Project is that citizens can identify patterns that machines cannot detect without training, and machine learning algorithms can use citizen science projects as input training sets, creating amazing new opportunities to speed up the pace of discovery. A hybrid model of machine learning combined with crowdsourced training data from citizen scientists can not only classify large quantities of data, but also address the weaknesses of each approach if deployed alone.
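As a rough illustration of that hybrid loop, the sketch below turns volunteer votes into training labels and then fits a very simple classifier to them. It is not Brut's actual algorithm: the nearest-centroid classifier, the two made-up image features (`ring_contrast` and `roundness`), and all the numbers are assumptions chosen only to keep the example small and self-contained.

```python
from collections import Counter
from statistics import mean

# Hypothetical volunteer classifications: image id -> votes from three people.
crowd_votes = {
    "img_a": ["bubble", "bubble", "not_bubble"],
    "img_b": ["bubble", "bubble", "bubble"],
    "img_c": ["not_bubble", "not_bubble", "bubble"],
    "img_d": ["not_bubble", "not_bubble", "not_bubble"],
}

# Hypothetical per-image features: (ring_contrast, roundness), both in 0..1.
image_features = {
    "img_a": (0.9, 0.8),
    "img_b": (0.8, 0.9),
    "img_c": (0.1, 0.2),
    "img_d": (0.2, 0.1),
}


def majority_label(votes):
    """Collapse several volunteer votes into one consensus label."""
    return Counter(votes).most_common(1)[0][0]


def train_centroids(features, labels):
    """Average the feature vectors of each class (nearest-centroid training)."""
    centroids = {}
    for label in set(labels.values()):
        rows = [features[i] for i in features if labels[i] == label]
        centroids[label] = tuple(mean(column) for column in zip(*rows))
    return centroids


def predict(centroids, feature):
    """Assign the label whose centroid is closest to the feature vector."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(feature, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Step 1: the crowd supplies the labels. Step 2: the machine learns from them.
consensus = {image: majority_label(votes) for image, votes in crowd_votes.items()}
model = train_centroids(image_features, consensus)

# Step 3: the trained model classifies images no volunteer has seen.
print(predict(model, (0.7, 0.75)))
print(predict(model, (0.3, 0.2)))
```

The same trained model can also be pointed back at the crowd's own catalogue to flag entries it disagrees with, which is the kind of re-assessment described above.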

We’re really happy with this paper, and extremely grateful to Chris Beaumont (the study’s lead author) for his insights into machine learning and the way it can be successfully applied to the Milky Way Project. We will be using a version of Brut for our upcoming analysis of the new Milky Way Project classifications. It may also have implications for other Zooniverse projects.

If you’d like to read the full paper, it is freely available online at the arXiv – and Brut can be found on GitHub.

[Cross-posted on the Milky Way Project blog]

Just over three years ago the Zooniverse launched the Milky Way Project (MWP), my first citizen science project. I have been leading the development and science of the MWP ever since. 50,000 volunteers have taken part from all over the world, and they’ve helped us do real science, including creating astronomy’s largest catalogue of infrared bubbles – which is pretty cool.

Today the original Milky Way Project (MWP) is complete. It took about three years and users have drawn more than 1,000,000 bubbles and several million other objects, including star clusters, green knots, and galaxies. It’s been a huge success but: there’s even more data! So it is with glee that we have announced the brand new Milky Way Project! It’s got more data, more objects to find, and it’s even more gorgeous.

This second incarnation of my favourite Zooniverse project[1] has been an utterly different experience for me. Three years ago I had only recently learned how to build Ruby on Rails apps and had squirrelled myself away for hours carefully crafting the look and feel for my as-yet-unnamed citizen science project. I knew that it had to live up to the standards of Galaxy Zoo in both form and function – and that it had to produce science eventually.

Building and launching at that time was simpler in one sense (it was just me and Arfon that did most of the coding[2]) but so much harder as I was referring to the Rails manual constantly and learning Amazon Web Services on the fly. This week I have had the help of a team of experts at Zooniverse Chicago, who I normally collectively refer to as the development team. They have helped me by designing and building the website and also by integrating it seamlessly into the now buzzing Zooniverse infrastructure. The result has been an easier, smoother process with a far superior end result. I’ve essentially acted more like a consultant scientist, with a specification and requirements. I’ve still gotten my hands dirty (as you can see in the open source Milky Way Project GitHub repo) but I’ve managed to actually keep doing everything else I now do day-to-day at the Zooniverse. It’s been a fantastic experience to see personally how far we’ve come as an organisation.

The new MWP is being launched to include data from different regions of the galaxy in a new infrared wavelength combination. The new data consists of Spitzer/IRAC images from two surveys: Vela-Carina, which is essentially an extension of GLIMPSE covering Galactic longitudes 255°–295°, and GLIMPSE 3D, which extends GLIMPSE 1+2 to higher Galactic latitudes (at selected longitudes only). The images combine 3.6, 4.5, and 8.0 µm in the “classic” Spitzer/IRAC color scheme[3]. There are roughly 40,000 images to go through.

An EGO (or two) sitting in the dust near a young star cluster

The latest Zooniverse tech and design is being brought to bear on this big data problem. We are using our newest features to retire images with nothing in them (as determined by the volunteers of course) and to give more screen time to those parts of the galaxy where there are lots of pillars, bubbles and clusters – as well as other things. We’re marking more objects –  bow shocks, pillars, EGOs  – and getting rid of some older ones that either aren’t visible in the new data or weren’t as scientifically useful as we’d hoped (specifically: red fuzzies and green knots).

It’s very exciting! I’d highly recommend that you go now(!) and start classifying at www.milkywayproject.org – we need your help to map and measure our galaxy.

—–

[1] It’s like choosing between your children

[2] Arfon may recall my resistance to unit tests

[3] Classic to very geeky infrared astronomers

So… I’m a TED Fellow

November 19, 2013 — 2 Comments

TED

I’m happy to announce that I am one of the 2014 TED Fellows. It’s a fantastic opportunity and an awesome group to be a part of – you can see everyone else in the class on the TED Fellow blog. It’s an exciting time to join the TED crowd as TED is celebrating its 30th year, which includes a move to Vancouver and the theme of ‘the next chapter’. So I don’t just get to go to TED but new TED. Good stuff. I have been welcomed into the club by several TEDsters already – what a great group.

I’m hoping to meet amazing people, learn about ambitious, crazy projects and just be inspired.

KOI-351

We recently posted news of a Planet Hunters planet discovered as part of a seven-planet system. Like all the Planet Hunters stars this is one seen in data from NASA’s Kepler spacecraft. Dubbed Kepler-90, this system is a peculiar microcosm of our own Solar System, with small (probably rocky) worlds in the middle, and larger (probably gaseous) worlds on the outside. The major difference is that the outermost planet in this system is only as far from the star as Earth is from the Sun. The other six planets in this system were already known about, but thanks to volunteers on Planet Hunters (http://planethunters.org) we now think that there are seven worlds circling this star, which is just a little brighter than our Sun.

New PH Planet

To celebrate this fact I have created a model of the whole planetary system in Celestia, an awesome, cross-platform, open-source package that lets you explore space. You can download the Celestia model files directly here or watch the video below to be taken on a tour of Kepler-90 and its seven worlds.

In this video, I’ve given the newly discovered Planet Hunters candidate some fetching green rings – which we do not have any evidence for or against. Also keep in mind that we know very little about what most exoplanets look like, so we’ve used artistic license to give them all different appearances, often using the surface of what might be analogue worlds in our Solar System. Maybe you can spot some familiar surfaces amongst them!

This system has some great features that make it interesting. The outermost world is roughly the size of Jupiter but orbits at almost exactly the Earth-Sun distance of 1AU. A Jupiter-like world in an Earth-like orbit has been seen before in Planet Hunters discoveries. The middle planet in this system is at the same distance from this star as Mercury is from our Sun, but is six times as large. The rest of the planets whizz around in even smaller orbits. This star is a little hotter than our Sun so they are pretty scorching places with surface temperatures in the hundreds of degrees – nearly a thousand for the innermost planets.

Inner System of KOI-351

The two innermost planets are roughly Earth sized and are really cool. The innermost one is 1.02x the diameter of Earth and the next is 1.18x. We assume that they are both rocky since they are so small. They orbit the star in just 7 days and 9 days respectively and are very close together. So close in fact that if you’re living on the inner, smaller planet then every few weeks, for about a week, the second planet appears in the sky about half the size of our full Moon.
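If you want to play with these numbers yourself, the sketch below works through the geometry using the rounded figures from this post (7 and 9 day orbits, a planet 1.18x the diameter of Earth). The rest are my assumptions for the sake of a quick estimate: a star of exactly one solar mass, circular orbits in the same plane, and a full Moon about 0.52° across. It gives the time between close approaches and the apparent size at the moment of closest approach, which is the largest the second planet would ever look; it shrinks quickly as the two planets drift apart again.

```python
import math

AU_KM = 1.495978707e8          # one astronomical unit in km
EARTH_DIAMETER_KM = 12742.0    # mean diameter of Earth in km
FULL_MOON_DEG = 0.52           # approximate angular diameter of the full Moon


def semi_major_axis_au(period_days, stellar_mass_solar=1.0):
    """Kepler's third law: a^3 = M * P^2, with a in AU, P in years, M in solar masses."""
    period_years = period_days / 365.25
    return (stellar_mass_solar * period_years ** 2) ** (1.0 / 3.0)


def synodic_period_days(inner_days, outer_days):
    """Time between successive close approaches of two planets."""
    return 1.0 / (1.0 / inner_days - 1.0 / outer_days)


def angular_diameter_deg(diameter_km, distance_km):
    """Apparent angular diameter of a sphere seen from a given distance."""
    return math.degrees(2.0 * math.atan(diameter_km / (2.0 * distance_km)))


# Rounded values from the post; the stellar mass of 1.0 is an assumption.
inner_period, outer_period = 7.0, 9.0
outer_planet_diameter_km = 1.18 * EARTH_DIAMETER_KM

closest_km = (semi_major_axis_au(outer_period) - semi_major_axis_au(inner_period)) * AU_KM
size_deg = angular_diameter_deg(outer_planet_diameter_km, closest_km)

print(f"Close approach every {synodic_period_days(inner_period, outer_period):.1f} days")
print(f"Apparent size at closest approach: {size_deg:.2f} degrees")
print(f"That is {size_deg / FULL_MOON_DEG:.2f} times the full Moon")
```

Changing the stellar mass or using more precise orbital periods shifts the answer, so treat the output as a ballpark figure only.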

Every year I see the rumour going round that Mars is going to be as big as the full moon. It will never happen for us – but on the tiny worlds circling Kepler-90, it happens all the time.

Update: The system used to be called KOI-351 but was given the name Kepler-90 just a day after this post went live. I have updated the name of the system in the text.

[Cross-posted on the Planet Hunters blog]

UNAWE's Citizen Science Astronomy Projects Poster - CAP conference 2013

This is a poster from CAP2013, which I am attending in Warsaw. Love the idea and the design. Follow @UNAWE on Twitter and find them online at http://unawe.org/.