-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathdocumentation.tex
228 lines (200 loc) · 11.5 KB
/
documentation.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
\documentclass[12pt]{aastex}
\makeatletter
\let\@dates\relax
\makeatother
\usepackage{epsfig,amsfonts,amsmath,url,xcolor,wasysym,wrapfig,booktabs,placeins,float,braket,xspace,soul}
\usepackage[T1]{fontenc}
%\usepackage{setspace}
%\singlespacing
\linespread{1.0}
\begin{document}
\title{SciDBoss Documentation}
\begin{section}{Quasar sample}
I used a testing sample of 5000 quasars from the DR12 catalog (\url{http://www.sdss.org/dr12/algorithms/boss-dr12-quasar-catalog/}).
These quasars were required to have $2.15 < z < 5$ (using $Z\_VI$) and $BAL\_FLAG\_VI = 0$; that is, they were required to have no visually apparent
broad absorption line troughs. From this set I randomly selected 5000 quasars and downloaded the spec files for each of these quasars,
from \url{http://data.sdss3.org/sas/dr12/boss/spectro/redux/ %d/spectra/%04d/spec-%04d-%05d-%04d.fits % (rerun, plate, plate, mjd)}
(see \url{https://www.sdss3.org/dr9/spectro/pipeline.php} for a description, under "per object spec files"). For each quasar I used the coadded data combining each exposure.
These data are stored in the fits file $boss\_test\_sample.fits$ and in the directory $boss\_test\_sample$. The function loading all the data is $load\_all\_data$ within
$lya\_functions.py$. I also mask the data by rejecting any pixel with the $AND\_MASK > 0$ (see \url{http://www.sdss.org/dr12/algorithms/bitmasks/#SPPIXMASK} for a description of the pixel maskbits; the $AND\_MASK$
means that the maskbit was set for any of the exposures within the coadd, not necessarily all). Since the SDSS sky subtraction is often inadequate around bright night
sky lines, I additionally mask 6 \AA\ on either side of the night sky lines at 5577 \AA, 6300 \AA, 6363 \AA, and the ISM sodium D line at 5890 \AA.
The function doing this masking is $mask\_data$ in $lya\_functions.py$.
% should we use the spec-lyas? read Lee paper. i think they were processed differently
\end{section}
\begin{section}{Finding <F(z)>}
As a first step, I fit a model of the following form to each quasar spectrum:
\begin{equation}
f(\lambda) = A_{qso} \bar{C}(\lambda_{r}) <F(z_{abs})>
\label{eqn:model}
\end{equation}
where $\bar{C}(\lambda_{r})$ is the mean quasar continuum, $A_{qso}$ is the amplitude of each individual quasar, and $<F(z_{abs})>$ is the
mean transmitted flux fraction through the Ly$\alpha$ forest. This model assumes that all quasars have the same shape and does not consider
fluctuations from the mean transmission.
Rather than fitting this model in one go I proceed iteratively. In the first iteration I take the mean continuum to be 1:
\begin{equation}
\bar{C}_0(\lambda_{r}) = 1
\label{eqn:c0}
\end{equation}
The mean continuum is defined over a specified grid; for now, the grid starts at 600 \AA, ends at 3600 \AA, and has spacing of 4 \AA\ between points.
I use the Faucher-Giguere 2008 model as a rough estimate for $<F(z)>$ (the analytic fit cited in the Font-Ribera DLA paper):
\begin{equation}
<F(z)_0> = \exp{[-0.0018(1+z)^{3.92}]}
\label{eqn:fz0}
\end{equation}
$<F(z)>$ is defined in a grid where each point has $\Delta z$ = 0.03, and spans a total width of 1.5 in redshift. The beginning of $<F(z)>$ is the
lowest measurable absorption redshift: the blue end SDSS spectroscopic limit, 3566.97 \AA, divided by Ly$\alpha$, 1215.67 \AA, minus 1.
Last, I find the initial quasar amplitude by finding the weighted mean of the quasar flux between 1275 and 1285 \AA.
\begin{equation}
A_{qso,0} = \frac{\sum f_i w_i}{\sum w_i}
\label{eqn:a0}
\end{equation}
$w_i$ is the weight of each pixel, calculated from its total inverse variance:
\begin{equation}
w_i = \left(\sigma_{pipe}^2 + f_i^2 \sigma_{i}^2\right)^{-1}
\label{eqn:wt_for_a0}
\end{equation}
where $\sigma_{pipe}^2$ is the variance outputted by the SDSS pipeline (from the spec file) and $\sigma_{i}^2$ is the intrinsic variance,
a step function
\begin{equation}
\sigma_i^2 =
\begin{cases}
0.1 & \text{if } \lambda_r < 1215.67 \\
0.01 & \text{if } \lambda_r > 1215.67
\end{cases}
\end{equation}
Note that $\sigma_i^2$ is dimensionless. We need $\sigma_{i}^2$ for two reasons.
First, there is intrinsic variance associated with the transmission fraction in the Lyman$\alpha$ forest (due to cosmic structure)
so even a high S/N quasar should have some floor to its variance in the Lyman$\alpha$ forest region.
Second, we need an intrinsic variance to prevent high S/N quasars from being weighted too heavily; if we had no intrinsic variance,
then the mean continuum would be heavily biased towards high S/N quasars. Note also the units of the weight: since the weights are inverse-variance
weights, their units are flux$^{-2}$.
%The other tricky part in this equation is the division
%by $f_{norm}$ to produce a normalized variance. Ideally I would divide $\sigma_{pipe}^2$ by the square of the estimated continuum flux.
%Note that you should not divide $\sigma_{pipe}^2$ by $f_i^2$. In the Lyman-alpha forest, $\sigma_{pipe}^2$ may be $>> f_i^2$ (since some pixels may have close to zero flux).
%This weight would effectively exclude these pixels and bias the estimate of the mean continuum too high. So we must use an estimate
%for the continuum flux instead of $f_i$ in the weights; because no better estimate exists in the first iteration, I simply use $f_{norm}$. in future
%iterations, I will replace $f_{norm}$ with $\bar{C}_i$ from the previous iteration.
Now I begin to iterate. First I compute $\bar{C}$:
\begin{equation}
%\bar{C}(\lambda_r}_{i} = \frac{\sum_j w_j \frac{f_j}{A_{qso, i} \bar{F}(z)_{i,j}}}{\sum_j w_j}
\bar{C}(\lambda_r)_{i} = \frac{\sum_j w_j \frac{f_j}{A_{qso,i-1} \bar{F}(z)_{i-1,j}}}{\sum_j w_j}
\label{eqn:c_i}
\end{equation}
The weights are equal to the total inverse variance of the quantity being averaged
\begin{equation}
w = \left(\frac{\sigma_{pipe}^2}{A_{qso,i-1}^2 <F(z)>_{i-1}^2} + \bar{C}_{i-1}^2 \sigma_i^2\right)^{-1}
\label{eqn:wt_for_c_i}
\end{equation}
Note that I use the $(i-1)$th estimates of $A_{qso}$, $<F(z)>$ and $\bar{C}$.
For any pixel with $z_{abs} > z_{max} + \Delta z/2$, I have no estimate of $<F(z)>_{i-1}$, so I use
the Faucher-Giguere estimate at these redshifts for all iterations.
I also fix the mean continuum to be 1 on average between 1275 and 1285 \AA, in order to break the degeneracy
between mean continuum amplitude and individual quasar amplitude.
In addition, I fix the blue-side slope of the mean continuum in order to break the degeneracy between
blue-side slope and the slope of the mean transmission $<F(z)>$. Following the PCA fits of Paris et al.,
I require the continuum at 1050 \AA\ to be 25\% greater than the continuum at 1280 \AA; in both cases,
the continuum flux at each point is computed by averaging over a box 10 \AA\ in width (e.g. a region 1045 to 1055 \AA\
and a region 1275 to 1285 \AA).
Next I compute $A_{qso}$. Here I use a chisq minimization that is formally identical to the weighted average but is easier to implement
if I switch to using a multi-parameter model for the continuum.
Specifically I only fit to the region red of Ly$\alpha$,
and I use weighted least squares to properly account for the different variances between different pixels; the weights are given by:
\begin{equation}
w = \sqrt{\left(\frac{\sigma_{pipe}^2}{<F(z)>_{i-1}^2} + A_{qso,i-1}^2 \bar{C}_{i}^2 \sigma_i^2\right)^{-1}}
\label{eqn:wt_for_least_squares}
\end{equation}
% (in general a re-read and check-thru of the code is a good idea! make sure that it is consistent with what I've written here)
in other words the inverse of the standard deviation of each pixel, in units of flux$^{-1}$.
Note also that $<F(z)>_{i-1}$ will always be equal to 1 since we only fit red of Ly$\alpha$.
Written in matrix form we then solve
\begin{equation}
W \vec{T} A_{qso} = W \vec{f}
\label{eqn:lls}
\end{equation}
where $W$ is the diagonal matrix of the weights, $\vec{T}$ is the template ($\bar{C}(\lambda_r)_i$ re-binned to the quasar wavelengths), $\vec{f}$ is the measured fluxes,
and $A_{qso}$ is the scalar amplitude that we're fitting for. This is a least squares problem of the form $\vec{a} X = \vec{b}$ so it can be
solved quickly by already-existing libraries to find $A_{qso}$.
Now I find $<F(z)>$. I consider only the region between
1041 \AA\ and 1185 \AA rest frame. I assign each pixel
to the nearest $z_{abs}$ and calculate $<F(z)>$ in the following way:
\begin{equation}
<F(z)>_i = \frac{\sum_j w_j \frac{f_j}{A_{qso,i} \bar{C}_i}}{\sum w_j}
\label{eqn:fz}
\end{equation}
where now $w_j$ is given by
\begin{equation}
w_j = \left(\frac{\sigma_{pipe}^2}{(A_{qso,i} \bar{C}_i)^2} + <F(z)>_{i-1}^2\sigma_{i}^2\right)^{-1}
\label{eqn:w_for_fz}
\end{equation}
I also calculate the error on each measurement of $<F(z)>$:
\begin{equation}
\sigma^2_F = \frac{1}{\sum{w_i}}
\label{eqn:fz_error}
\end{equation}
This follows from the variance of a weighted mean (see appendix). Also, $<F(z)>$ is fixed in order to break the degeneracy
between $<F(z)>$ and $\bar{C}$ in the Ly$\alpha$ forest region. I fix the mean of $<F(z)>$ between $z = 2.2$ and $z = 2.6$
to equal the mean of Faucher-Giguere's measurement of $<F(z)>$ between 2.2 and 2.6.
To measure convergence, I calculate a "chisq" statistic:
\begin{equation}
\chi^2 = \sum \left(\frac{<F(z)>_{i}-<F(z)>_{i-1}}{\sigma_{Fi}}\right)^2
\label{eqn:fz_chisq}
\end{equation}
where the sum runs over the redshift bins, and $i$ refers to the ith iteration. Convergence is reached when this statistic
is less than the number of redshift bins.
\end{section}
\begin{section}{Appendix: variance of a weighted mean}
Consider the estimation of the mean and standard error for a set of data $x_i$ with weights $w_i$.
These weights can
be interpreted as non-integer ``counts'' of each data point. For instance, a weight of 1.3 means that a particular datapoint
is counted 1.3 times.
Take the probability of obtaining a particular measurement $x_i$. Assume this distribution is Gaussian:
\begin{equation}
P_{i}(\mu') \propto \exp{[-\frac{1}{2}(\frac{x_i-\mu'}{\sigma})^2]}
\end{equation}
The probability of observing the entire data set of $N$ observations is given by
\begin{equation}
P(\mu') = \prod^{N}{P_i(\mu')^{w_i}}
\end{equation}
We see that the interpretation of weights as effective counts means that we must raise each probability to the $w_i$ power.
We thus have
\begin{equation}
P(\mu') \propto \exp{[-\frac{1}{2}\sum{w_i(\frac{x_i-\mu'}{\sigma})^2}]}
\label{eqn:prob}
\end{equation}
According to the method of maximum likelihood, the most probable value for the mean is the value that maximizes
the probability in equation~\ref{eqn:prob}. This is equivalent to minimizing the argument of the exponential:
\begin{equation}
X = -\frac{1}{2}\sum{w_i(\frac{x_i-\mu'}{\sigma})^2}
\end{equation}
The minimization condition is
\begin{equation}
\frac{\partial X}{\partial \mu'} = 0
\end{equation}
yielding (after algebraic simplification)
\begin{equation}
\sum{w_i(x_i-\mu')} = 0
\end{equation}
or
\begin{equation}
\mu' =\frac{ \sum{w_i x_i}}{\sum{w_i}}
\end{equation}
This checks with our intuition.
The standard error is given by
\begin{equation}
\sigma_{\mu}^2 = \sum{[\sigma_i^2(\frac{\partial \mu}{\partial x_i})^2]}
\end{equation}
using error propagation. We have
\begin{equation}
\frac{\partial \mu}{\partial x_i} = \frac{\partial}{\partial x_i}(\frac{\sum{w_i x_i}}{\sum{w_i}}) = \frac{w_i}{\sum{w_i}}
\end{equation}
Thus
\begin{equation}
\sigma_{\mu}^2 = \sum{[\sigma_i^2(\frac{w_i}{\sum{w_i}})^2]}
\end{equation}
In our case the weights are equal to the inverse of the variance. Therefore
\begin{equation}
\sigma_{\mu}^2 = \left(\frac{1}{\sum w_i}\right)^2 \sum \frac{w_i^2}{w_i} = \frac{1}{\sum w_i}
\end{equation}
\end{section}
\end{document}