Kaisey S. Mandel: Type Ia Supernova Inference: Hierarchical Bayesian Statistical Models in the Optical and Near Infrared <slides>
Type Ia supernovae (SN Ia) are the most precise cosmological distance indicators, important for measuring cosmic acceleration and the properties of dark energy. Current and upcoming automated wide-field surveys will find large numbers of SN Ia, but cosmological inferences are already limited by systematic errors. Current cosmological analyses use optical light curves to estimate distances, whose accuracy is limited by the
confounding effects of host galaxy dust extinction. The combination of optical and near infrared (NIR) light curves and spectroscopic data has the potential to improve inference and distance predictions in supernova cosmology. I have constructed a principled, hierarchical Bayesian framework, described by a directed acyclic graph, to coherently model the multiple random and uncertain effects underlying the observed SN Ia data, including measurement error, intrinsic supernova covariances, host galaxy dust extinction and reddening, and distances. An MCMC code, BayeSN, efficiently computes probabilistic inferences for the parameters of individual SN and the hyperparameters of the population. Application to optical, NIR, and spectroscopic data demonstrates that the combination of optical and NIR data approximately doubles the precision of cross-validated SN Ia distance
predictions compared to using optical data alone, and yields estimates of the correlations between the intrinsic colors and the characteristics of supernova spectral lines.
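The core of such a hierarchical model, partial pooling of noisy per-object estimates toward a population distribution, can be sketched in a few lines. This toy uses a conjugate normal-normal model as a stand-in for the full BayeSN machinery; all numbers are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy: per-supernova distance moduli mu_i drawn from a
# population N(mu_pop, tau^2), observed with heteroscedastic noise sigma_i.
mu_pop, tau = 35.0, 0.15
sigma = rng.uniform(0.1, 0.4, size=20)
mu_true = rng.normal(mu_pop, tau, size=20)
m_obs = rng.normal(mu_true, sigma)

# Conjugate normal-normal posterior for each latent mu_i given the
# hyperparameters: a precision-weighted average ("shrinkage") of the
# noisy observation and the population mean.
post_prec = 1.0 / tau**2 + 1.0 / sigma**2
post_mean = (mu_pop / tau**2 + m_obs / sigma**2) / post_prec
```

Noisier supernovae (larger sigma_i) are pulled more strongly toward the population mean, which is the mechanism by which the hierarchy improves individual distance estimates.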

Jo Bovy: Quasar classification and characterization from broadband multi-filter, multi-epoch data sets <slides><poster>
Quasars—actively accreting supermassive black holes—are among the most luminous objects in the Universe. Large samples of quasars can be used to study topics including inflationary cosmology, the evolution of black hole growth over the course of cosmic history, and the physics of astrophysical black hole accretion. One of the major challenges for the peta-scale surveys of the future is to classify and estimate the distances to quasars without the need for expensive spectroscopic follow-up. I will present currently used techniques to classify quasars from broadband photometry, focusing on the XDQSO method—a probabilistic method that uses the extreme-deconvolution density estimation technique to handle missing and highly uncertain data—and a critical appraisal of other machine learning methods currently used. Going forward, the major challenges will be to
(1) incorporate variability and astrometric data into the currently used color selection for optimal quasar selection, (2) separate quasars from galaxies (as opposed to stars) as we go to fainter magnitudes, and (3) strike a balance between data-driven, non-parametric methods—which work well for bright quasars—and template-based techniques—necessary for faint quasars where host-galaxy contamination of the observed flux is significant.
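A stripped-down sketch of this kind of probabilistic photometric classification, with a single Gaussian per class standing in for XDQSO's extreme-deconvolution mixtures; all means, covariances, and priors here are invented for illustration. Convolving each class model with the object's own error covariance is what lets highly uncertain photometry be handled gracefully:

```python
import numpy as np

def log_gauss(x, mean, cov):
    # Log-density of a 2-D Gaussian at a single point x.
    d = x - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ inv * d).sum(-1) - 0.5 * logdet - np.log(2 * np.pi)

# Hypothetical class models in a 2-D "color" space.
mean_qso, cov_qso = np.array([0.2, 0.1]), np.eye(2) * 0.05
mean_star, cov_star = np.array([1.0, 0.6]), np.eye(2) * 0.02
prior_qso = 0.1  # quasars are rare among point sources

def p_quasar(colors, err_cov):
    # Convolve each class model with the object's error covariance, so
    # highly uncertain photometry is naturally down-weighted, then apply
    # Bayes' rule with the class priors.
    lq = log_gauss(colors, mean_qso, cov_qso + err_cov) + np.log(prior_qso)
    ls = log_gauss(colors, mean_star, cov_star + err_cov) + np.log(1 - prior_qso)
    m = max(lq, ls)
    return np.exp(lq - m) / (np.exp(lq - m) + np.exp(ls - m))
```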

Gregor Rossmanith: Probing non-Gaussianities in the CMB on an incomplete sky using surrogates <slides>
For investigations of CMB data sets, the analysis of Fourier phases has proven to be a useful method, since all potential higher-order correlations, which directly point to non-Gaussianities, are contained in the phases and the correlations among them. The method of surrogate maps with shuffled Fourier phases represents one way of analysing the phases. The shuffling approach relies on the orthogonality of the spherical harmonics, which only holds for the full sky. However, astrophysical foreground emission, mainly present in the Galactic plane, is a major challenge in CMB analyses. We demonstrate the feasibility of generating surrogates by Fourier-based methods even for an incomplete data set, by transforming the spherical harmonics into a new set of basis functions that are orthonormal on the cut sky. The results show that non-Gaussianities and hemispherical asymmetries in the CMB, as identified in several former investigations, can still be detected even when the complete Galactic plane is removed. We conclude that the Galactic plane cannot be the dominant source of these anomalies.
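On the sphere the shuffling acts on spherical-harmonic phases; the essential operation is easiest to see in a flat 1-D toy, where a surrogate keeps the Fourier amplitudes (and hence the power spectrum) while randomizing the phases:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=256)  # stand-in for a (flat-sky) CMB strip

# Surrogate: keep the Fourier amplitudes and randomize the phases,
# which destroys any higher-order correlations among the modes.
X = np.fft.rfft(x)
phases = rng.uniform(0, 2 * np.pi, size=X.size)
phases[0] = 0.0          # keep the zero-frequency bin real
if x.size % 2 == 0:
    phases[-1] = 0.0     # the Nyquist bin must also stay real
surrogate = np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=x.size)
```

Any statistic that differs systematically between the data and an ensemble of such surrogates must come from phase correlations, i.e. from non-Gaussianity.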

Heike Modest: Probing non-Gaussianities in the CMB with Minkowski Functionals and Scaling Indices using surrogates<slides>
We are analysing the cosmic microwave background (CMB) with respect to possible higher-order correlations (HOCs), which would be indicators of non-Gaussianities in the primordial density field of the universe. Here, the analysis of the CMB Fourier phases is a promising and cosmological-model-independent method. To generate so-called surrogate maps, possible correlations among the Fourier phases of the original data (here the CMB map from the WMAP experiment) are destroyed by applying a shuffling scheme to the maps in Fourier space. A comparison of the original maps and the surrogate maps then allows us to test for the presence of HOCs in the original maps, also and especially on well-defined scales. Using Minkowski Functionals and Scaling Indices as test statistics for the HOCs in the maps, we find deviations from the hypothesis of a Gaussian CMB with a significance of up to 10 sigma on the largest scales, namely within the Fourier modes l from 0 to 20. We calculate the significance between the test statistics of the original data and the surrogates for different hemispheres in the sky and find hemispherical asymmetries as well as deviations from Gaussianity in the northern and southern sky. Calculating the significance for smaller parts of the sky enables us to locate certain regions in the southern sky that show deviations from Gaussianity, while the signal found in the north vanishes.
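As a toy illustration of using a Minkowski Functional as a Gaussianity test (on an uncorrelated Gaussian array rather than a real CMB map): the area of the excursion set above a threshold has a known Gaussian expectation, so deviations from it flag non-Gaussianity.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(2)
field = rng.normal(size=(512, 512))  # Gaussian stand-in for a CMB patch

def area_fraction(f, nu):
    """First Minkowski functional: the area of the excursion set f > nu,
    normalized by the total map area."""
    return np.mean(f > nu)

# For a Gaussian field this functional is known in closed form, so a
# non-Gaussian map shows up as a deviation from the expected curve.
nus = [-1.0, 0.0, 1.0]
measured = [area_fraction(field, nu) for nu in nus]
expected = [0.5 * erfc(nu / sqrt(2)) for nu in nus]
```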

Jacob Vanderplas, Andrew Connolly, Bhuvnesh Jain: Processing Shear Maps with Karhunen-Loeve Analysis <slides>
Wide-field probes of weak gravitational lensing have the potential to address fundamental questions about the nature of the universe. Measures such as the correlation function, power spectrum, or statistics of shear peaks can be compared with theoretical predictions to answer substantive questions about the nature of dark matter, dark energy, gravity, and primordial perturbations. Comparison of the data to the theoretical model, however, can be subject to systematic effects due to survey geometry, selection functions, and other biases. This can be framed as a machine learning problem: given a sparse set of noisy observations, how can one best recover the underlying signal of interest? We propose to address these challenges using a compressed-sensing approach based on a Karhunen-Loeve (KL) model of the signal. This approach can efficiently recover the shear signal from
noisy data with arbitrary masking and survey geometry. The signal-to-noise-ranked KL vectors allow effective noise filtration, leading to a 30% decrease in B-mode contamination for simulated data. Furthermore, because the KL model is based on covariance matrices, it naturally encapsulates the two-point information of the field and provides a framework for efficient Bayesian likelihood analysis of the two-point statistics of a cosmological shear
survey.
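A 1-D toy of the KL filtering step; an invented Gaussian covariance stands in for the shear covariance, and the Wiener weighting of the retained modes is a common choice rather than necessarily the authors' exact scheme:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
# Hypothetical signal covariance: smooth correlations on a 1-D grid,
# standing in for the two-point information of the shear field.
x = np.arange(n)
C = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / 10.0**2)

# KL basis: eigenvectors of the signal covariance, ranked by variance.
evals, evecs = np.linalg.eigh(C)
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

# Draw a signal with this covariance, observe it with white noise.
signal = evecs @ (np.sqrt(np.clip(evals, 0, None)) * rng.normal(size=n))
noisy = signal + rng.normal(scale=0.5, size=n)

# Filtering: keep only the high-S/N KL modes and Wiener-weight them.
k = 15
a = evecs[:, :k].T @ noisy
weights = evals[:k] / (evals[:k] + 0.5**2)
recon = evecs[:, :k] @ (weights * a)
```

The same projection works if pixels are masked out: one simply builds the basis from the covariance restricted to the observed pixels, which is how arbitrary survey geometry is accommodated.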

François-Xavier Dupé, Jean-Luc Starck: Galaxy overdensity estimation: toward learning the missing data <slides>
Galaxies are an important tracer of the matter in the universe, and galaxy surveys are commonly used to study the matter distribution. However, these surveys present several problems. First, they are subject to shot noise (i.e. Poisson noise, because they are counting maps); second, most of the data in the galactic plane cannot be trusted or is unavailable, and the missing data have to be properly taken into account. As we focus on medium scales, we assume that the matter overdensity, and hence the galaxy overdensity, follows a log-normal distribution. Using a data augmentation framework, we propose a two-step method for both denoising and inferring the missing data (inpainting). We begin by filling the missing data of the observation with a texture synthesis algorithm that tries to preserve the observed data and the assumed power spectrum. As the texture synthesis is random, we can generate several complete observations (multiple imputations). Then, given the completed observations, we estimate the galaxy overdensity using a MAP estimator with a log-normal prior and support preservation. Preliminary results are shown using both synthetic and real datasets. Finally, some extensions are proposed in which we learn the information needed to estimate the overdensity and fill the missing data, while still keeping a strong link to the theory.
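The denoising half of the method can be illustrated with a per-pixel toy (brute-force MAP on a grid of log-intensities; the real method additionally handles masking and spatial correlations):

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative setup: galaxy counts are Poisson draws around an intensity
# whose logarithm is Gaussian (the log-normal assumption of the abstract).
mu, sig = 2.0, 0.5
lam_true = np.exp(rng.normal(mu, sig, size=500))
counts = rng.poisson(lam_true)

def map_lognormal_poisson(n, mu, sig):
    """Per-pixel MAP of the intensity under Poisson noise and a log-normal
    prior, by brute-force search over the log-intensity."""
    loglam = np.linspace(-2, 6, 2001)
    logpost = n * loglam - np.exp(loglam) - (loglam - mu)**2 / (2 * sig**2)
    return np.exp(loglam[np.argmax(logpost)])

lam_map = np.array([map_lognormal_poisson(n, mu, sig) for n in counts])
```

The prior shrinks low-count pixels toward the population intensity, which is exactly where raw counts are least reliable.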

Marisa C. March, Roberto Trotta, L. Amendola, D. Huterer: Future dark energy probes and their robustness to systematics <slides>
We extend the Figure of Merit formalism usually adopted to quantify the statistical performance of future dark energy probes to assess the robustness of a future mission to plausible systematic bias. We introduce a new robustness Figure of Merit which can be computed in the Fisher Matrix formalism given arbitrary systematic biases in the observable quantities. We argue that robustness to systematics is an important new quantity that should be taken into account when optimizing future surveys. We illustrate our formalism with toy examples, and apply it to future type Ia supernova (SNIa) and baryonic acoustic oscillation (BAO) surveys. For the simplified systematic biases that we consider, we find that SNIa are a somewhat more robust probe of dark energy parameters than the BAO. We trace this back to a geometrical alignment of the systematic bias direction with statistical degeneracy directions in the dark energy parameter space.
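The ingredients of the formalism fit in a few lines; the Fisher matrix and bias vector below are invented for illustration, not taken from the paper:

```python
import numpy as np

# Hypothetical 2-parameter (e.g. w0, wa) toy: a Fisher matrix with a
# strong degeneracy, and a systematic offset in the observables mapped
# to a parameter bias via delta_theta = F^{-1} b.
F = np.array([[40.0, -18.0],
              [-18.0, 10.0]])
Finv = np.linalg.inv(F)

# The usual statistical Figure of Merit: inverse area of the error ellipse.
fom_stat = 1.0 / np.sqrt(np.linalg.det(Finv))

b = np.array([1.0, -0.4])      # illustrative systematic bias vector
delta_theta = Finv @ b         # induced bias on the parameters

# The chi^2 cost of the bias: a bias aligned with the degeneracy
# (the long axis of the error ellipse) costs little; one orthogonal
# to it is damaging, which is what a robustness FoM quantifies.
chi2_bias = delta_theta @ F @ delta_theta
```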

William B. March, Andrew Connolly, Alexander G. Gray: Efficient Estimation of N-point Spatial Statistics <slides>
Precise statistical analyses of astronomical data are the key to validating models of complex phenomena, such as dark matter and dark energy. In particular, spatial statistics are needed for large-scale sky catalogs. The n-point correlation functions provide a complete description of any point process and are widely used to understand astronomical data. However, the computational cost of estimating these functions scales as N^n for N
data points. Furthermore, these expensive computations must be repeated many times at many different scales in order to gain a detailed picture of the correlation function and to estimate its variance. Since astronomy surveys contain hundreds of millions or billions of points (and are growing rapidly), these computations are infeasible. We present a new approach based on multidimensional trees to overcome these computational obstacles. We build on the previously most efficient algorithm (Gray and Moore, 2001; Moore et al., 2001), which improved over the N^n scaling of a direct computation. In this work, we directly incorporate the computations at different scales along with the variance estimation. We can therefore achieve an order-of-magnitude speedup over the current state-of-the-art method. We show preliminary scaling results on a mock galaxy catalog.
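For reference, the naive estimator that tree algorithms accelerate looks like this: a direct O(N^2) pair count binned at several scales at once, combined with the simple DD/RR - 1 estimator. Real analyses use far larger catalogs and better estimators such as Landy-Szalay; the point of the sketch is only the repeated multi-scale pair counting:

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.uniform(size=(300, 2))   # mock galaxy positions (unclustered)
rand = rng.uniform(size=(300, 2))   # random comparison catalog

def pair_counts(a, b, edges):
    """Histogram all pair separations into several radial bins at once;
    tree-based codes accelerate exactly this computation."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.histogram(d.ravel(), bins=edges)[0]

edges = np.linspace(0.05, 0.25, 5)
DD = pair_counts(data, data, edges)
RR = pair_counts(rand, rand, edges)
# Simple (Peebles-Hauser) estimator of the two-point correlation function;
# for an unclustered mock it should be consistent with zero.
xi = DD / RR - 1.0
```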

I. Sidorenko, C. Räth: Evaluation of the Topological and Morphological Characteristics of the LSS During Evolution Process by Means of Minkowski Functionals <slides><poster>
We study the topology of the cosmic Large-Scale Structure (LSS) produced by the Millennium simulations (Springel, V. et al., 2005, Nature 435, 629) by means of Minkowski Functionals (MF). MF provide global morphological and topological characteristics of arbitrary structures. Applied to the dark matter density field, they reflect changes during the time evolution of the LSS. We analyse the simulated dark matter density field smoothed by Gaussian filters with different radii (from r=1.25 Mpc up to r=10 Mpc) at different evolution times (from z=127 up to z=0). We demonstrate that Gaussian smoothing with a large radius (r=10 Mpc) does not properly reflect topological changes in the dark matter structure during the evolution process and destroys the filamentary structure of the LSS in the present Universe (z=0), which remains present in the density field smoothed by filters with small radii (r=1.25 Mpc or r=2.5 Mpc). The transformation of the LSS from a nearly random distribution of matter at an early stage of the Universe to the filamentary structure at the present time corresponds to the onset and increase of an asymmetry in MF_2, MF_3 and MF_4 with respect to the mean density value.

Adam Gauci, John Abela, Kristian Zarb Adami, Lance Miller: Neural Networks and GREAT10 Galaxies <slides><poster>
This work investigates the application of artificial neural networks (ANNs) to deblur galaxy postcards of the GREAT10 challenge. High-resolution models are created and convolved with a given Point Spread Function (PSF) to generate the corresponding blurred images. These are then downsampled in Fourier space to obtain the resolution used in the challenge. Training examples for the ANN are created from the original and the blurred postcards. An n × n (for some odd n) window in a blurred image is compared to the same window in the original images, and the ANN learns to output the correct intensity of the middle pixel. This means that the intensities of neighbouring pixels are used in the input vector. Different weighting schemes for translating the output vector from the ANN into pixel values are explored. The advantages gained by using different window sizes, pixel encoding methods, and numbers of hidden neurons in the ANN are also investigated. The chi-squared error between the deblurred image and the original model is used to measure the performance.
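The construction of training pairs can be sketched directly; here a noisy image stands in for a PSF-blurred postcard, and the window size n = 5 is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
original = rng.normal(size=(32, 32))
blurred = original + rng.normal(scale=0.1, size=(32, 32))  # stand-in for PSF blur

def make_training_pairs(blurred, original, n=5):
    """For each interior pixel, the input is the flattened n x n (n odd)
    window of the blurred image and the target is the original centre
    pixel, exactly the window-to-centre-pixel mapping the ANN learns."""
    assert n % 2 == 1
    h = n // 2
    X, y = [], []
    for i in range(h, blurred.shape[0] - h):
        for j in range(h, blurred.shape[1] - h):
            X.append(blurred[i - h:i + h + 1, j - h:j + h + 1].ravel())
            y.append(original[i, j])
    return np.array(X), np.array(y)

X, y = make_training_pairs(blurred, original)
```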

Barnaby Rowe, Rachel Mandelbaum: The next weak lensing data challenge <slides>
One of the most profound mysteries in modern cosmology is the accelerated expansion of the universe (the discovery of which led to the 2011 physics Nobel Prize). Weak gravitational lensing, an observational method that has the potential to shed the most light on this mystery, relies on accurate measurement of the shapes of millions of galaxies to uncover tiny distortions caused by matter between the galaxies and us. However, accurately inferring the true galaxy shapes is complicated due to large distortions from the atmosphere, telescope optics, detector and pixel noise. As data arrives in greater quantities, requirements on measurement accuracy become more stringent, and weak lensing must now meet unprecedented image analysis challenges. This need has driven ongoing improvements to shape measurement algorithms, and led to the creation of public data analysis challenges, of which the STEP1, STEP2, GREAT08 and GREAT10 challenges are recent examples. Some approaches have been successfully honed and tested by astronomers, but winning entrants have also been found from the machine learning community. In this poster we summarize what has been learned about shape measurement systematics from previous challenges, and highlight critical issues for the field in the near future, which will be tested in the next weak lensing data challenge (currently under development).

S. Beckouche, J.-L. Starck, G. Peyre, J. Fadili: Dictionary Learning and Astronomical Image Restoration <slides>
Wavelets have been intensively used for astronomical image restoration during the last 20 years. However, wavelets have shown some limitations for images containing complex textured features, such as those found in cosmic string maps or planetary images. We propose to use recently developed dictionary learning techniques to overcome those limitations. We address here the problem where white Gaussian noise is to be removed from an image. The original image is assumed to be sparsely represented in a dictionary which is learned during the denoising. Patch averaging has proven to be an efficient way to combine local sparsity constraints with a global Bayesian treatment, and is applied here to process astrophysical images; the results are compared to classic wavelet shrinkage and associated techniques.

G. Nurbaeva, F. Courbin, M. Tewes, N. Cantale: Image deconvolution using Hopfield Neural network <slides>
Image deconvolution is a longstanding linear inverse problem with wide-ranging applications in many areas. In astronomy, all images are convolved with a Point Spread Function (PSF) and reach the observer blurred and distorted. Recovering the true images is therefore essential for precision astronomy and astrophysics. We present the TVNN (Total Variation using Hopfield Neural Network) method for the deconvolution of astronomical images. Pixels are fed into a Hopfield Neural Network (NN) whose energy function is minimized based on the Total Variation (TV) principle. TV, widely used in image processing, aims at minimizing the integral of the absolute gradient of the signal. One very important research area in cosmology is the study of dark energy and dark matter from weak cosmic gravitational lensing effects. For this purpose, highly accurate measurement of galaxy shapes is crucial and very effective PSF correction techniques are required. We have tested the TVNN deconvolution method on the GREAT10 challenge galaxy images and found it effective at measuring shapes. Its accuracy in doing so mostly depends on the signal-to-noise ratio of the PSF kernel used to convolve the image.
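The kind of energy being minimized can be illustrated with plain gradient descent on a smoothed-TV denoising objective: denoising rather than full deconvolution, and descent steps rather than Hopfield dynamics, to keep the sketch short. All parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)
clean = np.zeros((32, 32))
clean[8:24, 8:24] = 1.0                # flat object with sharp edges
noisy = clean + rng.normal(scale=0.2, size=clean.shape)

def denoise_tv(f, lam=0.15, step=0.01, iters=2000, eps=1e-4):
    """Approximate gradient descent on the TV-regularized energy
    0.5*||u - f||^2 + lam * sum sqrt(|grad u|^2 + eps); a Hopfield
    network minimizes an energy of this general form with its own
    dynamics. eps smooths the TV term for numerical stability."""
    u = f.copy()
    for _ in range(iters):
        gx, gy = np.gradient(u)
        norm = np.sqrt(gx**2 + gy**2 + eps)
        # (Minus) gradient of the TV term: divergence of the
        # normalized gradient field.
        div = np.gradient(gx / norm, axis=0) + np.gradient(gy / norm, axis=1)
        u -= step * ((u - f) - lam * div)
    return u

denoised = denoise_tv(noisy)
```

Because TV penalizes the total gradient rather than its square, noise in flat regions is suppressed while the sharp edges of the object are largely preserved.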

A. Ozakin, D. Lee, G. Richards, A. Gray: Nonparametric Estimation with Measurement Errors for Quasar Detection <slides>
Automatic quasar detection is a problem of fundamental importance in modern astronomy. Nonparametric classification techniques based on kernel density estimation (KDE) have been used to develop highly accurate methods of quasar detection, and fast algorithms using space-partitioning trees have made it possible to use these methods on large data sets (Riegel and Gray, 2008). However, astronomical observations come with
estimates of measurement errors arising from very different inaccuracies, for example, at different distances. Until now, these estimates have been ignored in the KDE approach to quasar detection, even though they have been demonstrated to improve the accuracy of recent parametric approaches. If the measurement errors are independent and identically distributed, deconvolution of the density estimate with the known error distribution gives an estimate of the error-free distribution. However, when the error magnitude depends on the data point (i.e., in the case of heteroscedastic errors), straightforward deconvolution does not work. We will describe an extension of KDE that makes use of the estimates of heteroscedastic measurement errors, and a fast algorithm for the evaluation of the relevant sums. We present preliminary results on the Sloan Digital Sky Survey data set.
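The basic extension can be sketched as a KDE whose kernel width varies per point; the fast-summation algorithm is the paper's contribution and is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=200)                      # true values
sig = rng.uniform(0.1, 0.5, size=200)         # per-point measurement errors
obs = x + rng.normal(scale=sig)               # heteroscedastic observations

def kde_hetero(query, data, errs, h=0.3):
    """Gaussian KDE in which each point's kernel is widened by its own
    measurement error, instead of using a single global bandwidth."""
    w = np.sqrt(h**2 + errs**2)               # per-point effective bandwidth
    u = (query[:, None] - data[None, :]) / w
    k = np.exp(-0.5 * u**2) / (np.sqrt(2 * np.pi) * w)
    return k.mean(axis=1)

q = np.linspace(-3, 3, 61)
dens = kde_hetero(q, obs, sig)
```

Noisier points contribute broader, flatter kernels, so they blur the estimate less than they would under a single shared bandwidth.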

G. Favole on behalf of the MultiDark-AIP collaboration: The multidark database for cosmological simulations <slides>
We present the online MultiDark Database - a Virtual Observatory-oriented, relational database for hosting various cosmological simulations. The data is accessible via an SQL (Structured Query Language) query interface, which also allows users to directly pose scientific questions. Further examples of the usage of the database are given in its extensive online documentation. The database is based on the same technology as the Millennium Database, a fact that will greatly facilitate the usage of both suites of cosmological simulations. The first release of the MultiDark Database hosts two 8.6-billion-particle cosmological N-body simulations: the Bolshoi (250 h^-1 Mpc simulation box, 1 h^-1 kpc resolution) and the MultiDark Run1 simulation (MDR1, or BigBolshoi; 1000 h^-1 Mpc simulation box, 7 h^-1 kpc resolution). The extraction methods for halos/subhalos from the raw simulation data, and how these data are structured in the database, are explained in this paper. With the first data release, users get full access to the halo/subhalo catalogs, various profiles of the halos at redshifts z = 0 to 15, and raw dark matter data for one time-step of the Bolshoi and four time-steps of the MultiDark simulation. Later releases will also include galaxy mock catalogs and additional merger trees for both simulations, as well as new large-volume simulations with high resolution. This project is further proof of the viability of storing and presenting complex data using relational database technology. We encourage other simulators to publish their results in a similar manner.

Darren Davis, Wayne Hayes: Automatically Extracting Structure from Images of Spiral Galaxies <slides>
We have created a method for the efficient and automatic extraction of structure from images of spiral galaxies. In particular, we can isolate "spiral arm segments" by clustering pixels together based on arm segment membership. We can then automatically and objectively extract specific properties of spiral arm segments such as total luminosity, pitch angle (i.e., winding tightness), and length. This allows us to extract more global properties such as the average pitch angle of the arms in a galaxy, winding direction, and the existence of bars and rings. As far as we are aware, this is a first. Comparisons with the Galaxy Zoo project (human-based classifications) indicate that we agree with humans on the winding direction of the arms at about the same level as humans agree with each other (more than 95% of the time). The information that we extract may allow us to answer such interesting questions as how structure evolves with the age of the universe, how structure depends on the local environment in which a galaxy resides, and how its structure changes as a function of the wavelength in which it is observed. The project has already garnered much interest and collaborations with astronomers.

## Spotlight & Poster Presentations

## Spotlight Session I (8.20-8.52)

Kaisey S. Mandel: Type Ia Supernova Inference: Hierarchical Bayesian Statistical Models in the Optical and Near Infrared

Jo Bovy: Quasar classification and characterization from broadband multi-filter, multi-epoch data sets

Heike Modest: Probing non-Gaussianities in the CMB with Minkowski Functionals and Scaling Indices using surrogates

Gregor Rossmanith: Probing non-Gaussianities in the CMB on an incomplete sky using surrogates

Jacob Vanderplas, Andrew Connolly, Bhuvnesh Jain: Processing Shear Maps with Karhunen-Loeve Analysis

François-Xavier Dupé, Jean-Luc Starck: Galaxy overdensity estimation: toward learning the missing data

Marisa C. March, Roberto Trotta, L. Amendola, D. Huterer: Future dark energy probes and their robustness to systematics

William B. March, Andrew Connolly, Alexander G. Gray: Efficient Estimation of N-point Spatial Statistics

## Spotlight Session II (10.00-10.32)

I. Sidorenko, C. Räth: Evaluation of the Topological and Morphological Characteristics of the LSS During Evolution Process by Means of Minkowski Functionals

Adam Gauci, John Abela, Kristian Zarb Adami, Lance Miller: Neural Networks and GREAT10 Galaxies

Barnaby Rowe, Rachel Mandelbaum: The next weak lensing data challenge

S. Beckouche, J.-L. Starck, G. Peyre, J. Fadili: Dictionary Learning and Astronomical Image Restoration

G. Nurbaeva, F. Courbin, M. Tewes, N. Cantale: Image deconvolution using Hopfield Neural network

A. Ozakin, D. Lee, G. Richards, A. Gray: Nonparametric Estimation with Measurement Errors for Quasar Detection

G. Favole on behalf of the MultiDark-AIP collaboration: The multidark database for cosmological simulations

Darren Davis, Wayne Hayes: Automatically Extracting Structure from Images of Spiral Galaxies

Kaisey S. Mandel:Type Ia Supernova Inference: Hierarchical Bayesian Statistical Models in the Optical and Near Infrared <slides>Type Ia supernovae (SN Ia) are the most precise cosmological distance indicators, important for measuring cosmic acceleration and the properties of dark energy. Current and upcoming automated wide-field surveys will find large numbers of SN Ia, but cosmological inferences are already limited by systematic errors. Current cosmological analyses use optical light curves to estimate distances, whose accuracy is limited by the

confounding effects of host galaxy dust extinction. The combination of optical and near infrared (NIR) light curves and spectroscopic data has the potential to improve inference and distance predictions in supernova cosmology. I have constructed a principled, hierarchical Bayesian framework, described by a directed acyclic graph, to coherently model the multiple random and uncertain effects underlying the observed SN Ia data, including measurement error, intrinsic supernova covariances, host galaxy dust extinction and reddening, and distances. An MCMC code, BayeSN, efficiently computes probabilistic inferences for the parameters of individual SN and the hyperparameters of the population. Application to optical, NIR, and spectroscopic data demonstrates that the combination of optical and NIR data approximately doubles the precision of cross-validated SN Ia distance

predictions compared to using optical data alone, and estimates correlations between the intrinsic colors and characteristics of supernova spectral lines.

Jo Bovy:Quasar classification and characterization from broadband multi-filter, multi-epoch data sets <slides><poster>Quasars—actively accreting supermassive black holes—are among the most luminous objects in the Universe. Large samples of quasars can be used to study topics including inflationary cosmology, the evolution of black hole growth over the course of cosmic history, and the physics of astrophysical black hole accretion. One of the major challenges for the peta-scale surveys of the future is to classify and estimate the distances to quasars without the need for expensive spectroscopic follow-up. I will present currently used techniques to classify quasars from broadband photometry, focusing on the XDQSO method—a probabilistic method that uses the extreme-deconvolution density estimation technique to handle missing and highly uncertain data—and a critical appraisal of other machine learning methods currently used. Going forward the major challenges will be to

(1) incorporate variability and astrometric data into the currently used color selection for optimal quasar selection, (2) separate quasars from galaxies (as opposed to stars) as we go to fainter magnitudes, and (3) strike a balance between data-driven, non-parametric methods—which work well for bright quasars—and template-based techniques—necessary for faint quasars where host-galaxy contamination of the observed flux is significant.

Gregor Rossmanith:Probing non-Gaussianities in the CMB on an incomplete sky using surrogates <slides>For investigations of CMB data sets, the analysis of Fourier phases has proven to be a useful method, since all potential higher order correlations, which directly point to non-Gaussianities, are contained in the phases and the correlations among them. The method of surrogate maps with shuffled Fourier phases represents one way of analysing the phases. The shuffling approach relies on the orthogonality of the spherical harmonics, which only holds for the full sky. However, astrophysical foreground emission, mainly present in the Galactic plane, are a major challenge in CMB analyses. We demonstrate the feasibility to generate surrogates by Fourier-based methods also for an incomplete data set by transforming the spherical harmonics into a new set of basis functions that are orthonormal on the cut sky. The results show that non-Gaussianities and hemispherical asymmetries in the CMB as identified in several former investigations, can still be detected even when the complete Galactic plane is removed. We conclude that the Galactic plane cannot be the dominant source for these anomalies.

Heike Modest:Probing non-Gaussianities in the CMB with Minkowski Functionals and Scaling Indices using surrogates<slides>We are analysing the cosmic microwave background (CMB) in respect to possible higher order correlations (HOCs) which would be indicators for non-Gaussianities in the primordial density field of the universe. Here, the analysis of the CMB Fourier phases is a promising and cosmological-model independent method. For generating so-called surrogate maps possible phase correlations of the Fourier phases of the original data (here the CMB map from the WMAP experiment) are destroyed applying a shuffling scheme to the maps in Fourier space. A comparison of the original maps and the surrogate maps then allows to test for the presence of HOCs in the original maps, also and especially on well-defined scales. Using Minkowski Functionals and Scaling Indices as test statistics for the HOCs in the maps we find deviations from the hypothesis of a Gaussian CMB with a significance of up to 10 sigma on largest scales, namely within the Fourier modes l from 0 to 20. We calculate the significance between the test statistics of the original data and the surrogates for different hemispheres in the sky and find hemispherical asymmetries as well as deviations from Gaussianity in the northern and southern sky. Calculating the significance for smaller parts of the sky enables us to locate certain regions in the southern sky that show deviations from Gaussianity while the signal found in the north vanishes.

Jacob Vanderplas, Andrew Connolly, Bhuvnesh Jain:Processing Shear Maps with Karhunen-Loeve Analysis <slides>Wide-field probes of weak gravitational lensing have the potential to address fundamental questions about the nature of the universe. Measures such as the correlation function, power spectrum, or statistics of shear peaks can be compared with theoretical predictions to answer substantive question about the nature of dark matter, dark energy, gravity, and primordial perturbations. Comparison of the data to the theoretical model, however, can be subject to systematic effects due to survey geometry, selection functions, and other biases. This can be framed as a machine learning problem: given a sparse set of noisy observations, how can one best recover the underlying signal of interest? We propose to address these challenges using a compressed-sensing approach based on a Karhunen-Loeve (KL) model of the signal. This approach can efficiently recover the shear signal from

noisy data with arbitrary masking and survey geometry. The signal-to-noise-ranked KL vectors allow effective noise filtration, leading to a 30% decrease in B-mode contamination for simulated data. Furthermore, because the KL model is based on covariance matrices, it naturally encapsulates the two-point information of the field and provides a framework for efficient Bayesian likelihood analysis of the two-point statistics of a cosmological shear

survey.

Franois-Xavier Dupe, Jean-Luc Starck:Galaxy overdensity estimation: toward learning the missing data <slides>As an important tracer of the matter in the universe, galaxy surveys are commonly used to study the matter distribution. However, these surveys present several problems. First they are subject to shot noise (i.e. Poisson noise, because they are counting maps), secondly most of the data in the galactic plane cannot be trusted or

is unavailable and missing data have to be properly taken into account. As we focus on medium scales, we assume that the matter overdensity, and so the galaxy overdensity, follow a log-normal distribution. Using a data augmentation framework, we propose a two-steps method for both denoising and inferring the missing data (inpainting). We begin by filling the missing data of the observation with a texture synthesis algorithm that try to

preserve the observed data and the assumed power spectra. As the texture synthesis is random, we can generate several complete observations (multiple imputations). Then, as we have completed observation, we estimate the galaxy overdensity using a MAP estimator with a log-normal prior and support preservation. Preliminary results are showed using both synthesis and real dataset. Finally, some extension are proposed where we learn

information needed to estimate the overdensity and fill the missing data, while still keeping a strong link the theory.

Marisa C. March, Roberto Trotta, L. Amendola, D. Huterer: Future dark energy probes and their robustness to systematics <slides>We extend the Figure of Merit formalism usually adopted to quantify the statistical performance of future dark energy probes to assess the robustness of a future mission to plausible systematic bias. We introduce a new robustness Figure of Merit which can be computed in the Fisher Matrix formalism given arbitrary systematic biases in the observable quantities. We argue that robustness to systematics is an important new quantity that should be taken into account when optimizing future surveys. We illustrate our formalism with toy examples, and apply it to future type Ia supernova (SNIa) and baryonic acoustic oscillation (BAO) surveys. For the simplified systematic biases that we consider, we find that SNIa are a somewhat more robust probe of dark energy parameters than the BAO. We trace this back to a geometrical alignment of systematic bias direction with statistical degeneracy directions in the dark energy parameter space.
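The Fisher-formalism ingredients can be illustrated with a toy linear observable (the model, redshifts, errors, and systematic shape below are all hypothetical, not the paper's probes):

```python
import numpy as np

# Toy probe: observable mu(z) = theta0 + theta1 * z at a set of redshifts,
# with independent Gaussian errors sigma (a stand-in for SNIa/BAO observables).
z = np.linspace(0.1, 1.0, 10)
sigma = 0.05
J = np.column_stack([np.ones_like(z), z])      # d mu / d theta

# Fisher matrix and the usual statistical Figure of Merit ~ sqrt(det F).
F = J.T @ J / sigma**2
fom = np.sqrt(np.linalg.det(F))

# A plausible systematic bias in the observables (e.g. a calibration drift).
delta_mu = 0.01 * z**2

# Fisher-formalism parameter bias: b = F^{-1} J^T C^{-1} delta_mu.
bias = np.linalg.solve(F, J.T @ delta_mu / sigma**2)

# For a linear model this matches the shift of the least-squares fit exactly.
theta_shift = np.linalg.lstsq(J, delta_mu, rcond=None)[0]
print("FoM:", fom)
print("Fisher bias:", bias, " LS shift:", theta_shift)
```

A robustness figure of merit compares the size and direction of such a bias vector with the statistical confidence contours encoded in F, penalizing biases aligned with degeneracy directions.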

William B. March, Andrew Connolly, Alexander G. Gray: Efficient Estimation of N-point Spatial Statistics <slides>Precise statistical analyses of astronomical data are the key to validating models of complex phenomena, such as dark matter and dark energy. In particular, spatial statistics are needed for large-scale sky catalogs. The n-point correlation functions provide a complete description of any point process and are widely used to understand astronomical data. However, the computational cost of estimating these functions scales as N^n for N data points. Furthermore, these expensive computations must be repeated many times at many different scales in order to gain a detailed picture of the correlation function and to estimate its variance. Since astronomy surveys contain hundreds of millions or billions of points (and are growing rapidly), these computations are infeasible. We present a new approach based on multidimensional trees to overcome these computational obstacles. We build on the previously most efficient algorithm (Gray and Moore, 2001; Moore et al., 2001), which improved over the N^n scaling of a direct computation. In this work, we incorporate the computations at different scales along with the variance estimation directly. We can therefore achieve an order of magnitude speedup over the current state-of-the-art method. We show preliminary scaling results on a mock galaxy catalog.
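For orientation, the n = 2 case reduces to pair counting; the brute-force version below makes the O(N^2) cost explicit (this is the baseline the tree-based algorithms avoid, not the paper's method) and histograms all scales in a single pass, which is the multi-scale idea mentioned in the abstract:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
pts = rng.random((n, 2))                 # points in the unit square

# All pairwise separations at once: O(N^2) work and memory.
d = np.sqrt(((pts[:, None, :] - pts[None, :, :])**2).sum(-1))
iu = np.triu_indices(n, k=1)             # each unordered pair once
seps = d[iu]

# Bin the pairs into many radial scales in one pass, rather than
# re-running the count separately for every scale.
edges = np.linspace(0.0, np.sqrt(2.0), 21)
dd, _ = np.histogram(seps, bins=edges)

print("total pairs:", dd.sum(), "expected:", n * (n - 1) // 2)
```

An actual 2-point estimator would compare these DD counts to random-catalog counts (e.g. DD/RR - 1); the higher n-point functions replace pairs by triples and beyond, which is where the N^n scaling bites.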

I. Sidorenko, C. Räth: Evaluation of the Topological and Morphological Characteristics of the LSS During the Evolution Process by Means of Minkowski Functionals <slides><poster>We study the topology of the cosmic Large-Scale Structure (LSS) produced by the Millennium simulations (Springel, V. et al., 2005, Nature 432, 629) by means of Minkowski Functionals (MF). MF provide global morphological and topological characteristics of arbitrary structures. Applied to the density field of the dark matter structure, they reflect changes during the time evolution of the LSS. We analyse the simulated dark matter density field smoothed by Gaussian filters with different radii (from r=1.25 Mpc up to r=10 Mpc) at different evolution times (from z=127 up to z=0). We demonstrate that Gaussian smoothing with a large radius (r=10 Mpc) does not properly reflect topological changes in the dark matter structure during the evolution process and destroys the filamentary structure of the LSS in the present Universe (z=0), which remains present in the density field smoothed by filters with small radii (r=1.25 Mpc or r=2.5 Mpc). The transformation of the LSS from a nearly random distribution of matter at the early stages of the Universe to the filamentary structure at the present time corresponds to the onset and increase of an asymmetry in MF_2, MF_3 and MF_4 with respect to the mean density value.
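Minkowski functionals are computed on excursion sets (the region where the smoothed density exceeds a threshold). A minimal 2D sketch, standing in for the 3D analysis of the abstract, computes area fraction, a boundary-length proxy, and the Euler characteristic via Gray's 2x2 quad-counting algorithm:

```python
import numpy as np

def minkowski_2d(field, threshold):
    """Crude 2D Minkowski functionals of the excursion set field >= threshold:
    (area fraction, boundary-edge count, Euler characteristic).
    Euler characteristic via Gray's quad counting, 4-connectivity."""
    b = (field >= threshold).astype(int)
    area = b.mean()
    # boundary length proxy: number of foreground/background pixel edges
    perim = np.abs(np.diff(b, axis=0)).sum() + np.abs(np.diff(b, axis=1)).sum()
    p = np.pad(b, 1)
    q = p[:-1, :-1] + p[:-1, 1:] + p[1:, :-1] + p[1:, 1:]       # 2x2 quad sums
    diag = ((p[:-1, :-1] == p[1:, 1:]) & (p[:-1, 1:] == p[1:, :-1])
            & (p[:-1, :-1] != p[:-1, 1:]))                       # diagonal quads
    q1, q3, qd = (q == 1).sum(), (q == 3).sum(), diag.sum()
    euler = (q1 - q3 + 2 * qd) / 4.0
    return area, perim, euler

single = np.zeros((5, 5)); single[2, 2] = 1.0
ring = np.zeros((5, 5)); ring[1:4, 1:4] = 1.0; ring[2, 2] = 0.0
print(minkowski_2d(single, 0.5))   # one component: euler = 1
print(minkowski_2d(ring, 0.5))     # one component with one hole: euler = 0
```

Sweeping the threshold through the density range and tracking these functionals as functions of threshold is what reveals the asymmetry about the mean density described in the abstract.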

Adam Gauci, John Abela, Kristian Zarb Adami, Lance Miller: Neural Networks and GREAT10 Galaxies <slides><poster>This work investigates the application of artificial neural networks (ANNs) to deblur galaxy postcards from the GREAT10 challenge. High-resolution models are created and convolved with a given Point Spread Function (PSF) to generate the corresponding blurred images. These are then downsampled in Fourier space to obtain the resolution used in the challenge. Training examples for the ANN are created from the original and blurred postcards. An n × n window (for some odd n) in a blurred image is compared to the same window in the original image, and the ANN learns to output the correct intensity of the middle pixel. This means that the intensities of neighbouring pixels are used in the input vector. Different weighting schemes for translating the output vector from the ANN into pixel values are investigated. The advantages gained by using different window sizes, pixel encoding methods, and numbers of hidden neurons in the ANN are also examined. The chi-squared error between the deblurred image and the original model is used to measure the performance.
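The window-to-centre-pixel training setup can be sketched with a linear least-squares map standing in for the ANN (a toy: the image, PSF, and 5x5 window are illustrative, and a real ANN would add hidden layers and nonlinearity):

```python
import numpy as np

def conv2(img, k):
    """Same-size 2D convolution with edge replication (small kernels only)."""
    kh, kw = k.shape
    pad = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.zeros_like(img)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
    return out

rng = np.random.default_rng(3)
orig = conv2(rng.standard_normal((32, 32)), np.ones((3, 3)) / 9.0)  # smooth model
psf = np.ones((3, 3)) / 9.0
blurred = conv2(orig, psf)

# Training pairs: a w x w window of the blurred image -> the matching
# central pixel of the original, exactly the setup in the abstract (w = 5).
w, h = 5, 2
rows, targets = [], []
for i in range(h, 32 - h):
    for j in range(h, 32 - h):
        rows.append(blurred[i - h:i + h + 1, j - h:j + h + 1].ravel())
        targets.append(orig[i, j])
X, y = np.array(rows), np.array(targets)

# Linear least squares as a single-layer stand-in for the trained ANN.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef

print("mse blurred  :", np.mean((X[:, (w * w) // 2] - y)**2))
print("mse deblurred:", np.mean((pred - y)**2))
```

Because the blurred centre pixel is itself one input feature, the fitted map can only do at least as well as the blurred image, which makes the improvement easy to verify.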

Barnaby Rowe, Rachel Mandelbaum: The next weak lensing data challenge <slides>One of the most profound mysteries in modern cosmology is the accelerated expansion of the universe (the discovery of which led to the 2011 physics Nobel Prize). Weak gravitational lensing, an observational method that has the potential to shed the most light on this mystery, relies on accurate measurement of the shapes of millions of galaxies to uncover tiny distortions caused by matter between the galaxies and us. However, accurately inferring the true galaxy shapes is complicated due to large distortions from the atmosphere, telescope optics, detector and pixel noise. As data arrives in greater quantities, requirements on measurement accuracy become more stringent, and weak lensing must now meet unprecedented image analysis challenges. This need has driven ongoing improvements to shape measurement algorithms, and led to the creation of public data analysis challenges, of which the STEP1, STEP2, GREAT08 and GREAT10 challenges are recent examples. Some approaches have been successfully honed and tested by astronomers, but winning entrants have also been found from the machine learning community. In this poster we summarize what has been learned about shape measurement systematics from previous challenges, and highlight critical issues for the field in the near future, which will be tested in the next weak lensing data challenge (currently under development).

S. Beckouche, J.-L. Starck, G. Peyré, J. Fadili: Dictionary Learning and Astronomical Image Restoration <slides>Wavelets have been used intensively for astronomical image restoration during the last 20 years. However, wavelets have shown some limitations for images containing complex texture features, such as those found in cosmic string maps or planetary images. We propose to use recently developed dictionary learning techniques to overcome those limitations. We address here the problem where white Gaussian noise is to be removed from an image. The original image is assumed to be sparsely represented in a dictionary which is learned during the denoising. Patch averaging has proven to be an efficient way to combine a local sparsity constraint with a global Bayesian treatment, and is applied here to process astrophysical images, in comparison with classic wavelet shrinkage and associated techniques.
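The core loop of dictionary learning on image patches can be sketched with a spherical-k-means-style toy (a crude stand-in for K-SVD or similar; the image, patch size, and dictionary size are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# A toy "textured" image: oriented sinusoids plus noise.
xx, yy = np.meshgrid(np.arange(64), np.arange(64))
img = (np.sin(0.4 * xx) + np.sin(0.3 * xx + 0.5 * yy)
       + 0.1 * rng.standard_normal((64, 64)))

# Extract 6x6 patches and remove each patch's mean.
p = 6
patches = np.array([img[i:i + p, j:j + p].ravel()
                    for i in range(0, 64 - p, 3) for j in range(0, 64 - p, 3)])
patches -= patches.mean(axis=1, keepdims=True)

# Alternate 1-sparse coding with an atom update by the principal direction
# of the patches assigned to each atom.
K = 12
D = patches[rng.choice(len(patches), K, replace=False)]
D /= np.linalg.norm(D, axis=1, keepdims=True)

def recon_error(D, patches):
    c = patches @ D.T                        # correlation with every atom
    best = np.argmax(np.abs(c), axis=1)      # 1-sparse code: best atom
    coef = c[np.arange(len(patches)), best]
    return np.mean((patches - coef[:, None] * D[best])**2), best

err0, _ = recon_error(D, patches)
for _ in range(10):
    _, best = recon_error(D, patches)
    for k in range(K):
        sel = patches[best == k]
        if len(sel) == 0:
            continue
        _, vecs = np.linalg.eigh(sel.T @ sel)
        D[k] = vecs[:, -1]                   # principal direction of the cluster
err1, _ = recon_error(D, patches)
print("reconstruction mse: init", err0, "-> learned", err1)
```

Denoising then thresholds the sparse codes of noisy patches and averages the overlapping reconstructed patches back into the image, which is the patch-averaging step the abstract refers to.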

G. Nurbaeva, F. Courbin, M. Tewes, N. Cantale: Image deconvolution using a Hopfield Neural Network <slides>Image deconvolution is a longstanding linear inverse problem with wide-ranging applications in many areas. In astronomy, all images are convolved with a Point Spread Function (PSF) and reach the observer blurred and distorted. Recovering the true images is therefore essential for precision astronomy and astrophysics. We present the TVNN (Total Variation using Hopfield Neural Network) method for the deconvolution of astronomical images. Pixels are fed into a Hopfield Neural Network (NN) whose energy function is minimized based on the Total Variation (TV) principle. TV, widely used in image processing, aims at minimizing the integral of the absolute gradient of the signal. One very important research area in cosmology is the study of dark energy and dark matter from weak cosmic gravitational lensing effects. For this purpose, highly accurate measurement of galaxy shapes is crucial and very effective PSF correction techniques are required. We have tested the TVNN deconvolution method on the GREAT10 challenge galaxy images and found it effective at measuring shapes. Its accuracy in doing so mostly depends on the signal-to-noise ratio of the PSF kernel used to convolve the image.
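The TV-regularized deconvolution objective can be sketched with plain gradient descent standing in for the Hopfield-network energy minimization (a toy: the image, PSF, smoothing of the TV term, and all parameters are assumptions):

```python
import numpy as np

def conv2(img, k):
    """Same-size 2D convolution with edge replication (small kernels only)."""
    kh, kw = k.shape
    pad = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.zeros_like(img)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def tv_grad(x, eps=1e-8):
    """Gradient of the smoothed total variation sum sqrt(|grad x|^2 + eps)."""
    gx = np.roll(x, -1, axis=1) - x          # forward differences
    gy = np.roll(x, -1, axis=0) - x
    nrm = np.sqrt(gx**2 + gy**2 + eps)
    px, py = gx / nrm, gy / nrm
    # minus the divergence of the normalized gradient field
    return -((px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0)))

rng = np.random.default_rng(5)
truth = np.zeros((32, 32)); truth[10:22, 10:22] = 1.0     # a flat "galaxy"
psf = np.ones((3, 3)) / 9.0
data = conv2(truth, psf) + 0.02 * rng.standard_normal((32, 32))

# Minimize ||A x - y||^2 + lam * TV(x) by gradient descent; the adjoint A^T
# is convolution with the flipped PSF (symmetric here).
x, lam, step = data.copy(), 0.02, 0.2
for _ in range(300):
    resid = conv2(x, psf) - data
    x -= step * (2.0 * conv2(resid, psf[::-1, ::-1]) + lam * tv_grad(x))

print("mse blurred    :", np.mean((data - truth)**2))
print("mse deconvolved:", np.mean((x - truth)**2))
```

The TV term is what keeps the inversion stable at spatial frequencies the PSF suppresses, favouring piecewise-smooth solutions with sharp edges.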

A. Ozakin, D. Lee, G. Richards, A. Gray: Nonparametric Estimation with Measurement Errors for Quasar Detection <slides>Automatic quasar detection is a problem of fundamental importance in modern astronomy. Nonparametric classification techniques based on kernel density estimation (KDE) have been used to develop highly accurate methods of quasar detection, and fast algorithms using space-partitioning trees have made it possible to use these methods on large data sets (Riegel and Gray, 2008). However, astronomical observations come with estimates of measurement errors arising from very different inaccuracies (for example, at different distances), and until now these estimates have been ignored in the KDE approach to quasar detection, even though they have been demonstrated to improve the accuracy of recent parametric approaches. If the measurement errors are independent and identically distributed, deconvolution of the density estimate with the known error distribution gives an estimate of the error-free distribution. However, when the error magnitude depends on the data point (i.e., in the case of heteroscedastic errors), straightforward deconvolution does not work. We describe an extension of KDE that makes use of the estimates of heteroscedastic measurement errors, and a fast algorithm for the evaluation of the relevant sums. We present preliminary results on the Sloan Digital Sky Survey data set.
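One simple way to carry per-point error estimates into a KDE is to broaden each kernel by that point's own error variance; this models the error-convolved distribution (a true deconvolution variant would narrow instead) and is only an illustrative sketch of the idea, not the authors' estimator:

```python
import numpy as np

rng = np.random.default_rng(6)

# True (error-free) quantity, e.g. a colour index, observed with
# point-dependent (heteroscedastic) Gaussian errors.
x_true = rng.normal(0.0, 1.0, 1000)
s = rng.uniform(0.05, 0.4, 1000)          # per-point error estimates
x_obs = x_true + s * rng.standard_normal(1000)

def kde_with_errors(grid, data, errs, h):
    """KDE in which each Gaussian kernel has variance h^2 + s_i^2, so each
    point contributes with a width reflecting its own measurement error."""
    var = h**2 + errs**2                   # (n,)
    z = (grid[:, None] - data[None, :])**2 / var[None, :]
    return np.mean(np.exp(-0.5 * z) / np.sqrt(2 * np.pi * var[None, :]), axis=1)

grid = np.linspace(-5.0, 5.0, 1001)
f = kde_with_errors(grid, x_obs, s, h=0.2)
print("integral:", f.sum() * (grid[1] - grid[0]))   # a density integrates to ~1
```

The "relevant sums" in the abstract are exactly sums of this form, one Gaussian per data point with its own bandwidth, which is what the fast tree-based algorithm accelerates.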

G. Favole, on behalf of the MultiDark-AIP collaboration: The MultiDark Database for cosmological simulations <slides>We present the online MultiDark Database, a Virtual Observatory-oriented, relational database for hosting various cosmological simulations. The data are accessible via an SQL (Structured Query Language) query interface, which also allows users to directly pose scientific questions. Further examples of the usage of the database are given in its extensive online documentation. The database is based on the same technology as the Millennium Database, a fact that will greatly facilitate the usage of both suites of cosmological simulations. The first release of the MultiDark Database hosts two 8.6-billion-particle cosmological N-body simulations: the Bolshoi (250 h^−1 Mpc simulation box, 1 h^−1 kpc resolution) and the MultiDark Run1 simulation (MDR1, or BigBolshoi; 1000 h^−1 Mpc simulation box, 7 h^−1 kpc resolution). The extraction methods for halos/subhalos from the raw simulation data, and how these data are structured in the database, are explained in this paper. With the first data release, users get full access to halo/subhalo catalogs, various profiles of the halos at redshifts z = 0–15, and raw dark matter data for one time-step of the Bolshoi and four time-steps of the MultiDark simulation. Later releases will also include galaxy mock catalogs and additional merger trees for both simulations, as well as new large-volume simulations with high resolution. This project is further proof of the viability of storing and presenting complex data using relational database technology. We encourage other simulators to publish their results in a similar manner.

Darren Davis, Wayne Hayes: Automatically Extracting Structure from Images of Spiral Galaxies <slides>We have created a method for the efficient and automatic extraction of structure from images of spiral galaxies. In particular, we can isolate "spiral arm segments" by clustering pixels together based on arm segment membership. We can then automatically and objectively extract specific properties of spiral arm segments, such as total luminosity, pitch angle (i.e., winding tightness), and length. This allows us to extract more global properties, such as the average pitch angle of the arms in a galaxy, the winding direction, and the existence of bars and rings. As far as we are aware, this is a first. Comparisons with the Galaxy Zoo project (human-based classifications) indicate that we agree with humans on the winding direction of the arms at about the same level as humans agree with each other (more than 95% of the time). The information that we extract may allow us to answer interesting questions such as how structure evolves with the age of the universe, how structure depends on the local environment in which a galaxy resides, and how its structure changes as a function of the wavelength in which it is observed. The project has already garnered much interest and collaborations with astronomers.
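Pitch angle has a compact definition worth making concrete: for a logarithmic spiral r = a·exp(b·θ), the pitch angle is arctan(b), so it can be estimated by a linear fit of ln r against θ along an arm segment (a minimal sketch on synthetic points, not the authors' pipeline):

```python
import numpy as np

def pitch_angle_deg(x, y):
    """Pitch angle of a logarithmic-spiral arm segment r = a * exp(b * theta):
    fit ln r linearly against the unwrapped polar angle; pitch = arctan(b)."""
    theta = np.unwrap(np.arctan2(y, x))
    b = np.polyfit(theta, np.log(np.hypot(x, y)), 1)[0]
    return np.degrees(np.arctan(b))

# Synthetic arm segment with a 15-degree pitch angle.
theta = np.linspace(0.0, 4.0 * np.pi, 400)
b = np.tan(np.radians(15.0))
r = np.exp(b * theta)
x, y = r * np.cos(theta), r * np.sin(theta)
print("recovered pitch angle:", pitch_angle_deg(x, y))
```

The sign of the fitted slope b also gives the winding direction of the arm, the quantity compared against Galaxy Zoo classifications.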